apoor package

Module contents

A small personal package created to store code and data I often reuse.

I’ll continue to update it with useful functions that I find myself reusing. The apoor.data module has some common datasets and functions for reading them in as pandas DataFrames.

apoor.fdir(o: Any) → List[str]

Filtered dir(). Same as the builtin dir() function, but with private attributes filtered out.

Parameters:o – The object being inspected
Returns:The “public” attributes of o
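The package source isn’t shown here, but assuming “private” means underscore-prefixed names (the usual Python convention), the filtering fdir performs can be sketched as:

```python
from typing import Any, List

def fdir(o: Any) -> List[str]:
    """Filtered dir(): like dir(), but without underscore-prefixed names."""
    return [attr for attr in dir(o) if not attr.startswith("_")]
```

For example, `fdir([])` lists list methods like `append` and `sort` while filtering out dunder methods such as `__len__`.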
apoor.ibuff(itr: Iterable[T], bsize: int = 1) → Iterable[List[T]]

Creates an iterable that yields elements from itr grouped into lists of size bsize.

If itr can’t be evenly grouped into lists of size bsize, the final list will contain the remaining elements.

Parameters:
  • itr – The iterable to be buffered.
  • bsize

    Positive integer, representing the number of values from itr to be yielded together.

    The final list yielded may not be of size bsize if len(itr) doesn’t evenly divide into groups of bsize.

Yields:

Buffered elements from itr, grouped into lists of size up to bsize.

Raises:
  • TypeError – If bsize isn’t an integer.
  • ValueError – If bsize isn’t positive.

Examples

>>> for b in apoor.ibuff(range(10),3):
...     print(b)
[0, 1, 2]
[3, 4, 5]
[6, 7, 8]
[9]
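The implementation isn’t shown in these docs; a minimal sketch matching the described behavior (eager argument validation, then lazily yielded groups, with a short final group) might look like:

```python
from typing import Iterable, List, TypeVar

T = TypeVar("T")

def ibuff(itr: Iterable[T], bsize: int = 1) -> Iterable[List[T]]:
    """Yield elements from itr grouped into lists of up to bsize elements."""
    if not isinstance(bsize, int) or isinstance(bsize, bool):
        raise TypeError("bsize must be an integer")
    if bsize < 1:
        raise ValueError("bsize must be positive")
    def gen():
        buf: List[T] = []
        for item in itr:
            buf.append(item)
            if len(buf) == bsize:
                yield buf
                buf = []
        if buf:  # leftover elements form a shorter final list
            yield buf
    return gen()
```

Validating bsize before returning the generator (rather than inside it) means bad arguments raise at call time instead of on first iteration.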
apoor.make_scale(dmin: float, dmax: float, rmin: float, rmax: float, clamp: bool = False) → Callable[[float], float]

Scale function factory.

Creates a scale function to map a number from a domain to a range.

Parameters:
  • dmin – Domain’s start value
  • dmax – Domain’s end value
  • rmin – Range’s start value
  • rmax – Range’s end value
  • clamp – If True, results outside the range are clamped to it (default: False)
Returns:

A scale function that takes one numeric argument and returns the value mapped from the domain to the range (clamped if the clamp flag is set).

Examples

>>> s = make_scale(0,1,0,10)
>>> s(0.1)
1.0
>>> s = make_scale(0,10,10,0)
>>> s(1.0)
9.0
>>> s = make_scale(0,1,0,1,clamp=True)
>>> s(100)
1.0
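A linear-interpolation sketch consistent with the examples above (the min/max handling for clamping with a reversed range is an assumption):

```python
from typing import Callable

def make_scale(dmin: float, dmax: float, rmin: float, rmax: float,
               clamp: bool = False) -> Callable[[float], float]:
    """Return a function mapping [dmin, dmax] linearly onto [rmin, rmax]."""
    def scale(x: float) -> float:
        # Linear interpolation from the domain to the range.
        y = rmin + (x - dmin) * (rmax - rmin) / (dmax - dmin)
        if clamp:
            # Clamp to the range; min/max handles reversed ranges (rmin > rmax).
            lo, hi = min(rmin, rmax), max(rmin, rmax)
            y = max(lo, min(hi, y))
        return y
    return scale
```

Note that reversed ranges (e.g. `make_scale(0, 10, 10, 0)`) fall out of the interpolation formula for free, since `rmax - rmin` is simply negative.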
apoor.set_seed(n: int)

Sets numpy’s random seed.

Parameters:n (int) – The value used to set numpy’s random seed.
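This is a thin wrapper over numpy’s global seed; its effect is that subsequent draws from numpy’s global random state become reproducible:

```python
import numpy as np

def set_seed(n: int):
    """Seed numpy's global random number generator for reproducibility."""
    np.random.seed(n)
```

Seeding twice with the same value replays the same sequence of random draws.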
apoor.to_onehot(y: numpy.ndarray, num_classes: int = None, dtype='float32') → numpy.ndarray

Expands a 1D categorical vector to a 2D, onehot-encoded categorical matrix.

Parameters:
  • y – 1D categorical vector
  • num_classes

    Number of categories in (and width of) the output matrix.

    If num_classes is None, it is set to max(y) + 1.

  • dtype – Data type of output matrix
Returns:

2D one-hot encoded category matrix

Examples

>>> data = np.array([0,2,1,3])
>>> apoor.to_onehot(data)
array([[1., 0., 0., 0.],
       [0., 0., 1., 0.],
       [0., 1., 0., 0.],
       [0., 0., 0., 1.]])
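A sketch of the encoding using integer array indexing (assuming, per the num_classes default, that categories are the integers 0 through max(y)):

```python
import numpy as np

def to_onehot(y: np.ndarray, num_classes: int = None,
              dtype="float32") -> np.ndarray:
    """Expand a 1D integer category vector into a 2D one-hot matrix."""
    y = np.asarray(y, dtype="int64")
    if num_classes is None:
        num_classes = int(y.max()) + 1  # categories assumed to be 0..max(y)
    out = np.zeros((y.size, num_classes), dtype=dtype)
    out[np.arange(y.size), y] = 1  # set one column per row
    return out
```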
apoor.train_test_split(*arrays, test_pct: float = 0.15, val_set: bool = False, val_pct: float = 0.15) → Tuple[numpy.ndarray]

Splits arrays into train & test sets.

Splits arrays into train, test, and (optionally) validation sets using the supplied percentages.

Parameters:
  • *arrays

    An arbitrary number of sequences to be split into train, test, and (optionally) validation sets. Must have at least one array.

  • test_pct

    Float in the range [0,1]. Percent of total n values to include in test set.

    The train set will have 1.0 - test_pct pct of values (or 1.0 - test_pct - val_pct pct of values if val_set == True).

  • val_set – Whether or not to return a validation set, in addition to a test set.
  • val_pct

    Float in the range [0,1]. Percent of total n values to include in the validation set.

    Ignored if val_set == False.

    The train set will have 1.0 - test_pct - val_pct pct of values.

Returns:

splits – Tuple of numpy arrays: the input arrays split into train, test, and (optionally) validation sets.

If val_set == False, len(splits) == 2 * len(arrays), or if val_set == True, len(splits) == 3 * len(arrays).

Example

>>> x = np.arange(10)
>>> train_test_split(x)
(array([3, 9, 4, 2, 1, 0, 7, 5, 8]), array([6]))
>>> x = np.arange(10)
>>> y = x[::-1]
>>> x_train, x_test, y_train, y_test = train_test_split(x,y)
>>> x_train, x_test, y_train, y_test
(array([1, 3, 5, 8, 4, 7, 6, 9]),
 array([0, 2]),
 array([8, 6, 4, 1, 5, 2, 3, 0]),
 array([9, 7]))
>>> splits = train_test_split(x, test_pct=0.3, val_set=True, val_pct=0.2)
>>> [s.size for s in splits]
[5, 3, 2]
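A sketch of the splitting logic, using one shared index permutation so parallel arrays stay aligned. The size rounding and the train/test/val ordering per array are assumptions inferred from the examples, not confirmed by the package source:

```python
from typing import Tuple
import numpy as np

def train_test_split(*arrays, test_pct: float = 0.15, val_set: bool = False,
                     val_pct: float = 0.15) -> Tuple[np.ndarray, ...]:
    """Shuffle-split each input array into train/test(/val) pieces."""
    n = len(arrays[0])
    idx = np.random.permutation(n)      # one shared shuffle keeps arrays aligned
    n_test = round(n * test_pct)        # rounding behavior is an assumption
    n_val = round(n * val_pct) if val_set else 0
    test_i = idx[:n_test]
    val_i = idx[n_test:n_test + n_val]
    train_i = idx[n_test + n_val:]
    splits = []
    for a in arrays:
        a = np.asarray(a)
        splits.append(a[train_i])       # per array: train, test(, val)
        splits.append(a[test_i])
        if val_set:
            splits.append(a[val_i])
    return tuple(splits)
```

Because every array is indexed by the same permutation, corresponding elements of x and y end up in the same split, which is what makes the `x_train, x_test, y_train, y_test` unpacking above line up.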