Data Types | Python

Session Overview

  • Scalars
  • Vectors
  • Arrays/Matrices
  • Data Frames
  • Indexing
  • Dictionaries

Let’s Play!

  • Log in to the platform: http://esds.ncics.org

  • Select Previous Workspace, then JupyterLab

  • Create a Python file (or Jupyter notebook) and open a REPL

Scalars

What are the data types in Python?
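One way to explore this question is to ask Python directly with `type()`. The sketch below covers the common built-in scalar types; the variable names are illustrative and not part of the session material.

```python
# Common built-in scalar types; all variable names here are illustrative.
whole_number = 42          # int
decimal_number = 3.14      # float
text = "hello"             # str
flag = True                # bool
nothing = None             # NoneType
complex_number = 2 + 3j    # complex

for value in (whole_number, decimal_number, text, flag, nothing, complex_number):
    print(type(value).__name__, value)

# Mixing an int and a float in arithmetic produces a float.
mixed = whole_number + decimal_number
print(type(mixed).__name__)
```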

Vectors (Lists)

a = [1, 2, 3]

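b = list(range(1, 3))  # range excludes the stop value, so b is [1, 2]

a + b  # concatenation: [1, 2, 3, 1, 2]

a * b  # raises TypeError: a list cannot be multiplied by a list

a * 2  # repetition: [1, 2, 3, 1, 2, 3]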
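List operators work on the container, not on the numbers inside it. If element-wise arithmetic is what you want from plain lists, one possible approach is a comprehension with `zip`, sketched here with the same `a` and `b`; the result names are illustrative.

```python
a = [1, 2, 3]
b = list(range(1, 3))   # [1, 2]

try:
    a * b               # list * list is not defined
except TypeError as error:
    list_product_error = str(error)
    print("TypeError:", list_product_error)

# Element-wise product with a comprehension; zip stops at the shorter list.
elementwise_product = [x * y for x, y in zip(a, b)]

# Element-wise doubling, in contrast to a * 2, which repeats the list.
doubled = [x * 2 for x in a]

print(elementwise_product, doubled)
```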


hello_phrase = ['Hello', 'World', '!']

''.join(hello_phrase)  # 'HelloWorld!' (use ' '.join(...) to put spaces between items)


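hello_phrase[1]  # 'World' (indexing starts at 0)
hello_phrase[3]  # raises IndexError: valid indices are 0 to 2
hello_phrase[0]  # 'Hello'
a[1] + a[2]  # 2 + 3 = 5


a[1:2]  # [2] (the stop index is excluded)
a[0:2]  # [1, 2]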

len(a)
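A few more indexing patterns are worth trying alongside the ones above. This is a minimal sketch using the same `a` and `hello_phrase`; the result names are illustrative. Note that a slice past the end is clipped, while a single index past the end raises an error.

```python
a = [1, 2, 3]
hello_phrase = ['Hello', 'World', '!']

last_item = hello_phrase[-1]   # negative indices count from the end
first_two = a[:2]              # an omitted start defaults to 0
reversed_a = a[::-1]           # a step of -1 walks the list backwards
beyond_end = a[1:10]           # slices are clipped to the list length

try:
    hello_phrase[3]            # a single index past the end raises
except IndexError:
    out_of_range = True

print(last_item, first_two, reversed_a, beyond_end, out_of_range)
```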


test_vector = [0, 1, 'Three']

test_vector
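A list happily stores items of different types, which means arithmetic over the whole list may fail. One possible way to inspect and filter such a list is sketched below; the helper names are illustrative.

```python
test_vector = [0, 1, 'Three']

# Each element keeps its own type.
element_types = [type(item).__name__ for item in test_vector]

# Keep only the numeric items before doing arithmetic on them.
only_numbers = [item for item in test_vector if isinstance(item, (int, float))]
total = sum(only_numbers)

print(element_types, only_numbers, total)
```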

Arrays & Matrices

nested_list = [[1, 4, 5, 12],
               [-5, 8, 9, 0],
               [-6, 7, 11, 19]]


nested_list

nested_list[1]
nested_list[1][2]
nested_list[0][-1]
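With a nested list, the first index picks a row and the second picks an item within that row. Columns have no direct syntax, so a comprehension is one way to extract them. A short sketch using the same `nested_list`, with illustrative result names:

```python
nested_list = [[1, 4, 5, 12],
               [-5, 8, 9, 0],
               [-6, 7, 11, 19]]

second_row = nested_list[1]            # a whole row
element = nested_list[1][2]            # row 1, item 2
last_of_first = nested_list[0][-1]     # last item of the first row

n_rows = len(nested_list)
n_cols = len(nested_list[0])

# Pull out one column by visiting every row.
third_column = [row[2] for row in nested_list]

print(second_row, element, last_of_first, (n_rows, n_cols), third_column)
```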


import numpy as np
numpy_a = np.array(a)
type(numpy_a)

numpy_a + numpy_a

numpy_a * numpy_a
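The same operators behave differently on lists and on NumPy arrays: lists concatenate, arrays compute element by element. A side-by-side sketch, with illustrative result names:

```python
import numpy as np

a = [1, 2, 3]
numpy_a = np.array(a)

list_sum = a + a                    # lists: concatenation
array_sum = numpy_a + numpy_a       # arrays: element-wise addition
array_product = numpy_a * numpy_a   # arrays: element-wise multiplication

print(list_sum)
print(array_sum)
print(array_product)
```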


numpy_matrix = np.array(nested_list)

numpy_matrix

numpy_matrix + numpy_matrix

numpy_matrix[:, 0:3].dot(numpy_matrix)  # (3x3) dot (3x4) gives a 3x4 matrix product
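The column slice is needed because a matrix product requires the inner dimensions to match. The sketch below checks the shapes involved and shows that the `@` operator gives the same result as `.dot`; the names `square_part`, `product`, and `shapes_mismatch` are illustrative.

```python
import numpy as np

nested_list = [[1, 4, 5, 12],
               [-5, 8, 9, 0],
               [-6, 7, 11, 19]]
numpy_matrix = np.array(nested_list)      # shape (3, 4)

square_part = numpy_matrix[:, 0:3]        # all rows, first three columns: (3, 3)
product = square_part.dot(numpy_matrix)   # (3, 3) dot (3, 4) -> (3, 4)
same_product = square_part @ numpy_matrix # @ is the matrix-multiplication operator

try:
    numpy_matrix.dot(numpy_matrix)        # (3, 4) dot (3, 4): inner sizes differ
except ValueError:
    shapes_mismatch = True

print(product.shape, shapes_mismatch)
```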


Data Frames

Tables!

import pandas as pd

initial_dataframe = pd.DataFrame(
    {
        "A": 1.0,
        "B": pd.Timestamp("20130102"),
        "C": pd.Series(1, index=list(range(4)), dtype="float32"),
        "D": np.array([3] * 4, dtype="int32"),
        "E": pd.Categorical(["test", "train", "test", "train"]),
        "F": "foo",
    }
)

initial_dataframe.dtypes
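Each column of a data frame has its own type, and scalar values such as `1.0` or `"foo"` are repeated to fill every row. The sketch below rebuilds the same frame and inspects it; the exact names reported for the timestamp and text columns can vary between pandas versions, so only the other columns are commented.

```python
import numpy as np
import pandas as pd

initial_dataframe = pd.DataFrame(
    {
        "A": 1.0,                         # scalar, repeated for every row
        "B": pd.Timestamp("20130102"),    # scalar timestamp, also repeated
        "C": pd.Series(1, index=list(range(4)), dtype="float32"),
        "D": np.array([3] * 4, dtype="int32"),
        "E": pd.Categorical(["test", "train", "test", "train"]),
        "F": "foo",
    }
)

n_rows, n_cols = initial_dataframe.shape

# Column name -> dtype name, e.g. "A" -> "float64", "E" -> "category".
dtype_names = {name: str(dtype) for name, dtype in initial_dataframe.dtypes.items()}

# Keep only the numeric columns.
numeric_columns = list(initial_dataframe.select_dtypes(include="number").columns)

print((n_rows, n_cols), dtype_names, numeric_columns)
```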


dates = pd.date_range("20130101", periods=6)

df = pd.DataFrame(np.random.randn(6, 4), index=dates, columns=list("ABCD"))

df

df["A"]

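df[0:3]  # first three rows by position

df["20130102":"20130104"]  # rows by date label; both endpoints are included

df.loc[dates[0]]  # one row selected by label
df.loc["20130102", ["A", "B"]]  # one row, two columns by label

df.iloc[3:5, 0:2]  # rows 3-4 and columns 0-1 by position
df.iloc[1:3, :]  # rows 1-2, all columns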
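The key difference between the two accessors: `.loc` selects by label and includes both ends of a slice, while `.iloc` selects by integer position and excludes the stop. The sketch below makes this visible; as an assumption for predictability, it fills the frame with sequential integers instead of random numbers.

```python
import numpy as np
import pandas as pd

dates = pd.date_range("20130101", periods=6)
# Assumption: sequential integers replace the random values so results are predictable.
df = pd.DataFrame(np.arange(24).reshape(6, 4), index=dates, columns=list("ABCD"))

by_label = df.loc["20130102":"20130104"]   # label slice: both endpoints included
by_position = df.iloc[1:4]                 # position slice: stop excluded
single_row = df.loc[dates[0]]              # one row comes back as a Series
block = df.iloc[3:5, 0:2]                  # rows 3-4, columns A and B

print(by_label.equals(by_position))
print(block)
```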

df[df["A"] > 0]


df[df["A"] > 0].sum()
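Filtering happens in two steps: the comparison builds a True/False mask, and the mask then selects the matching rows. A minimal sketch with a small hand-written frame (hypothetical values, chosen so the result is easy to check):

```python
import pandas as pd

# Hypothetical values chosen for readability; the session uses random numbers.
df = pd.DataFrame({"A": [1.0, -2.0, 3.0], "B": [10.0, 20.0, 30.0]})

mask = df["A"] > 0            # boolean Series, one entry per row
positive_rows = df[mask]      # keeps only the rows where the mask is True
column_sums = positive_rows.sum()   # sums each remaining column

print(mask.tolist())
print(column_sums)
```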

Dictionaries

initial_dictionary = {
    "numpy_array": numpy_a,
    "dataframe": df,
    "number": 2,
    "text": "text",
}


initial_dictionary['number']
initial_dictionary['dataframe']
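Beyond lookup by key, a few other dictionary operations come up constantly. The sketch below uses a simplified dictionary with plain Python values so it runs on its own; the `"values"` and `"new_entry"` keys are illustrative.

```python
# Simplified stand-in for the dictionary above, using only built-in values.
initial_dictionary = {
    "number": 2,
    "text": "text",
    "values": [1, 2, 3],
}

number = initial_dictionary["number"]                       # lookup by key
fallback = initial_dictionary.get("missing_key", "not found")  # default instead of an error
has_text = "text" in initial_dictionary                     # membership tests the keys

initial_dictionary["new_entry"] = 3.5                       # assignment adds a key
keys = list(initial_dictionary)                             # keys keep insertion order

try:
    initial_dictionary["missing_key"]                       # square brackets raise on a missing key
except KeyError:
    missing_key_raised = True

print(number, fallback, has_text, keys)
```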