apoor package
Subpackages
Module contents
A small personal package created to store code and data I often reuse.
I’ll continue to update it with useful functions that I find myself reusing. The apoor.data module has some common datasets and functions for reading them in as pandas DataFrames.
apoor.fdir(o: Any) → List[str]
Filtered dir(). Same as the builtin dir() function, but without private attributes.
Parameters: o – Object being inspected
Returns: “Public attributes” of o
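The filtering described for fdir can be illustrated with a minimal sketch. It is not the package's actual implementation: the name fdir_sketch and the Point class are hypothetical, and it assumes that "private" means any name starting with an underscore.

```python
from typing import Any, List


def fdir_sketch(o: Any) -> List[str]:
    """Return the names from dir(o) that do not start with an underscore."""
    # Assumption: "private" covers both _single and __dunder__ names.
    return [name for name in dir(o) if not name.startswith("_")]


class Point:
    """Hypothetical class used only to demonstrate the filter."""

    def __init__(self) -> None:
        self.x = 1
        self._hidden = 2


print(fdir_sketch(Point()))  # ['x']
```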
apoor.ibuff(itr: Iterable[T_co], bsize: int = 1) → Iterable[List[T]]
Creates an iterable that yields elements from itr grouped into lists of size bsize.
If itr can’t be evenly grouped into lists of size bsize, the final list will hold the remaining elements.
Parameters:
- itr – The iterable to be buffered.
- bsize – Positive integer, representing the number of values from itr to be yielded together. The final list yielded may not be of size bsize if len(itr) doesn’t evenly divide into groups of bsize.
Yields: Buffered elements from itr, grouped into lists of size up to bsize.
Raises:
- TypeError – If bsize isn’t an integer.
- ValueError – If bsize isn’t positive.
Examples
>>> for b in apoor.ibuff(range(10), 3):
...     print(b)
[0, 1, 2]
[3, 4, 5]
[6, 7, 8]
[9]
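One way the buffering behaviour documented for ibuff could be written is sketched below. This is an illustration under stated assumptions, not the package's source: ibuff_sketch is a hypothetical name, and because it is a generator, the argument checks only run once iteration starts.

```python
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def ibuff_sketch(itr: Iterable[T], bsize: int = 1) -> Iterator[List[T]]:
    """Yield items from itr in lists of up to bsize elements."""
    # These checks execute on the first next() call, since this is a generator.
    if not isinstance(bsize, int):
        raise TypeError("bsize must be an integer")
    if bsize <= 0:
        raise ValueError("bsize must be positive")

    buf: List[T] = []
    for item in itr:
        buf.append(item)
        if len(buf) == bsize:
            yield buf
            buf = []  # start a fresh list so earlier groups are not mutated
    if buf:
        # Leftover elements form a final, shorter group.
        yield buf


print(list(ibuff_sketch(range(10), 3)))
# [[0, 1, 2], [3, 4, 5], [6, 7, 8], [9]]
```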
apoor.make_scale(dmin: float, dmax: float, rmin: float, rmax: float, clamp: bool = False) → Callable[[float], float]
Scale function factory.
Creates a scale function to map a number from a domain to a range.
Parameters:
- dmin – Domain’s start value
- dmax – Domain’s end value
- rmin – Range’s start value
- rmax – Range’s end value
- clamp – If the result is outside the range, return the clamped value (default: False)
Returns: A scale function that takes one numeric argument and returns the value mapped from the domain to the range (clamped if the clamp flag is set).
Examples
>>> s = make_scale(0,1,0,10)
>>> s(0.1)
1.0
>>> s = make_scale(0,10,10,0)
>>> s(1.0)
9.0
>>> s = make_scale(0,1,0,1,clamp=True)
>>> s(100)
1.0
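The mapping in these examples is consistent with plain linear interpolation, y = rmin + (x - dmin) / (dmax - dmin) * (rmax - rmin). The sketch below assumes that formula; make_scale_sketch is a hypothetical name, and the real function may treat edge cases such as dmin == dmax differently.

```python
from typing import Callable


def make_scale_sketch(dmin: float, dmax: float, rmin: float, rmax: float,
                      clamp: bool = False) -> Callable[[float], float]:
    """Build a function that linearly maps [dmin, dmax] onto [rmin, rmax]."""

    def scale(x: float) -> float:
        # Position of x within the domain, as a fraction (can fall outside 0..1).
        t = (x - dmin) / (dmax - dmin)
        y = rmin + t * (rmax - rmin)
        if clamp:
            # The range may be reversed (rmin > rmax), so sort the bounds first.
            lo, hi = min(rmin, rmax), max(rmin, rmax)
            y = max(lo, min(hi, y))
        return float(y)

    return scale


print(make_scale_sketch(0, 1, 0, 10)(0.1))            # 1.0
print(make_scale_sketch(0, 10, 10, 0)(1.0))           # 9.0
print(make_scale_sketch(0, 1, 0, 1, clamp=True)(100)) # 1.0
```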
apoor.set_seed(n: int)
Sets numpy’s random seed.
Parameters: n (int) – The value used to set numpy’s random seed.
apoor.to_onehot(y: numpy.ndarray, num_classes: int = None, dtype='float32') → numpy.ndarray
Expands a 1D categorical vector to a 2D, one-hot encoded categorical matrix.
Parameters:
- y – 1D categorical vector
- num_classes – Number of categories in (and width of) the output matrix. If num_classes is None, it is set to max(y) + 1.
- dtype – Data type of output matrix
Returns: 2D one-hot encoded category matrix
Examples
>>> data = np.array([0,2,1,3])
>>> apoor.to_onehot(data)
array([[1., 0., 0., 0.],
       [0., 0., 1., 0.],
       [0., 1., 0., 0.],
       [0., 0., 0., 1.]])
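For readers curious how such an encoding can be built, here is a minimal NumPy sketch using integer-array indexing. It is an assumption about one possible approach, not the package's code: to_onehot_sketch is a hypothetical name, and it assumes y contains non-negative integer labels.

```python
import numpy as np


def to_onehot_sketch(y, num_classes=None, dtype="float32"):
    """Expand a 1D vector of integer labels into a 2D one-hot matrix."""
    y = np.asarray(y, dtype=int)
    if num_classes is None:
        # Same default as documented above: one column per label up to max(y).
        num_classes = int(y.max()) + 1
    out = np.zeros((y.shape[0], num_classes), dtype=dtype)
    # Row i gets a 1 in column y[i]; every other entry stays 0.
    out[np.arange(y.shape[0]), y] = 1
    return out


print(to_onehot_sketch(np.array([0, 2, 1, 3])))
```

Passing a larger num_classes (for example 6) simply widens the matrix with all-zero columns for the unused categories.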
apoor.train_test_split(*arrays, test_pct: float = 0.15, val_set: bool = False, val_pct: float = 0.15) → Tuple[numpy.ndarray]
Splits arrays into train & test sets.
Splits arrays into train, test, and (optionally) validation sets using the supplied percentages.
Parameters:
- *arrays – An arbitrary number of sequences to be split into train, test, and (optionally) validation sets. Must have at least one array.
- test_pct – Float in the range [0,1]. Proportion of the total n values to include in the test set. The train set will have 1.0 - test_pct of the values (or 1.0 - test_pct - val_pct of the values if val_set == True).
- val_set – Whether or not to return a validation set, in addition to a test set.
- val_pct – Float in the range [0,1]. Proportion of the total n values to include in the validation set. Ignored if val_set == False. The train set will have 1.0 - test_pct - val_pct of the values.
Returns: splits – tuple of numpy arrays. Input arrays split into train, test, and val sets.
If val_set == False, len(splits) == 2 * len(arrays); if val_set == True, len(splits) == 3 * len(arrays).
Example
>>> x = np.arange(10)
>>> train_test_split(x)
(array([3, 9, 4, 2, 1, 0, 7, 5, 8]), array([6]))
>>> x = np.arange(10)
>>> y = x[::-1]
>>> x_train, x_test, y_train, y_test = train_test_split(x,y)
>>> x_train, x_test, y_train, y_test
(array([1, 3, 5, 8, 4, 7, 6, 9]), array([0, 2]), array([8, 6, 4, 1, 5, 2, 3, 0]), array([9, 7]))
>>> train_test_split(x,test_pct=0.3,val_set=True,val_pct=0.2)
(array([0, 9, 5, 7, 6, 2, 8]), array([1, 3, 4]), array([3, 4]))
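The return layout described above (train, test, and optionally val for each input array, in input order) can be sketched as follows. This is a hedged illustration rather than the package's implementation: split_sketch and its seed parameter are hypothetical, and it assumes that split sizes are rounded down and that the three splits are disjoint. The actual function may differ on both points.

```python
import numpy as np


def split_sketch(*arrays, test_pct=0.15, val_set=False, val_pct=0.15, seed=None):
    """Shuffle once, then slice every array with the same index groups."""
    if not arrays:
        raise ValueError("at least one array is required")
    n = len(arrays[0])
    # seed is a hypothetical extra parameter, added here for reproducibility.
    perm = np.random.default_rng(seed).permutation(n)

    # Assumption: sizes are truncated toward zero.
    n_test = int(n * test_pct)
    n_val = int(n * val_pct) if val_set else 0
    test_idx = perm[:n_test]
    val_idx = perm[n_test:n_test + n_val]
    train_idx = perm[n_test + n_val:]

    splits = []
    for a in arrays:
        a = np.asarray(a)
        splits.extend([a[train_idx], a[test_idx]])
        if val_set:
            splits.append(a[val_idx])
    return tuple(splits)


x = np.arange(10)
y = x[::-1]
x_train, x_test, x_val, y_train, y_test, y_val = split_sketch(
    x, y, test_pct=0.3, val_set=True, val_pct=0.2, seed=0
)
print(len(x_train), len(x_test), len(x_val))  # 5 3 2
```

Because one permutation is shared across all arrays, corresponding rows stay aligned: here y_train[i] == 9 - x_train[i] for every i.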