Arrays reorganizing#
asarray()#
The asarray() function converts an input to an array. The input can be a list, a tuple, an ndarray, etc.
Syntax:
numpy.asarray(data, dtype=None, order=None)
data: Data that you want to convert to an array
dtype: Optional. If not specified, the data type is inferred from the input data
order: Optional memory layout. C is row-major (C-style) and F is column-major (Fortran-style)
# Consider the following 2-D matrix with four rows and four columns filled by 1
import numpy as np
a = np.matrix(np.ones((4,4)))
np.array() copies its input by default, so assigning into its result changes only that copy and leaves the original matrix untouched.
np.array(a)[2]=3
print(a) # value won't change in result
[[1. 1. 1. 1.]
[1. 1. 1. 1.]
[1. 1. 1. 1.]
[1. 1. 1. 1.]]
The matrix itself is not immutable; the assignment above simply went to a copy. Use asarray() when you want changes to reach the original data: it returns a view of the same memory instead of a copy whenever no conversion is needed. Let’s check by setting the third row to 2:
np.asarray(a)[2]=2 # np.asarray(A): converts the matrix A to an array
print(a)
[[1. 1. 1. 1.]
[1. 1. 1. 1.]
[2. 2. 2. 2.]
[1. 1. 1. 1.]]
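The difference between the two calls comes down to whether memory is shared. A minimal sketch of that check follows, assuming a plain ndarray instead of np.matrix to keep it short; the names copied and shared are illustrative only.

```python
import numpy as np

a = np.ones((4, 4))

copied = np.array(a)    # np.array copies its input by default
shared = np.asarray(a)  # np.asarray reuses the buffer when no conversion is needed

print(np.shares_memory(a, copied))  # False
print(np.shares_memory(a, shared))  # True

copied[2] = 3   # changes only the copy
shared[2] = 2   # writes through to a
print(a[2])     # [2. 2. 2. 2.]
```

np.shares_memory is a convenient way to confirm whether an operation handed back a copy or a view before relying on in-place edits.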
arange()#
arange() is a built-in NumPy function that returns an ndarray of evenly spaced values within a given interval. For instance, to create the values from 1 to 10, you can use arange().
Syntax:
numpy.arange(start, stop, step)
start: Start of the interval (included)
stop: End of the interval (excluded)
step: Spacing between values. The default step is 1
# Example 1:
import numpy as np
np.arange(1, 11)
array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
To change the step, pass a third argument inside the parentheses.
# Example 2:
import numpy as np
np.arange(1, 14, 4)
array([ 1, 5, 9, 13])
np.arange(0,11,2) # even numbers, using a step of 2
array([ 0, 2, 4, 6, 8, 10])
np.arange(1,11,2) # odd numbers
array([1, 3, 5, 7, 9])
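Two details of arange() are easy to miss: start can be omitted, and the stop value is never part of the result. The short sketch below illustrates both, plus a negative step; the variable names are made up for the example.

```python
import numpy as np

counts = np.arange(5)        # start defaults to 0
down = np.arange(10, 0, -2)  # a negative step counts down

print(counts)  # [0 1 2 3 4]
print(down)    # [10  8  6  4  2]

# stop itself is excluded, which is why 1..10 needs np.arange(1, 11)
print(11 in np.arange(1, 11))  # False
```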
Reshape Data#
On some occasions, you need to change the shape of your data, for example from wide to long. You can use the reshape function for this.
Syntax:
numpy.reshape(a, newshape, order='C')
a: Array that you want to reshape
newshape: The new desired shape
order: Default is C, which is row-major (C-style) order
import numpy as np
e = np.array([(1,2,3), (4,5,6)])
print(e)
e.reshape(3,2)
[[1 2 3]
[4 5 6]]
array([[1, 2],
[3, 4],
[5, 6]])
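The order argument and the special size -1 are easier to see in action. This is a small sketch reusing the array e from above; inferred and column_major are names chosen for illustration.

```python
import numpy as np

e = np.array([(1, 2, 3), (4, 5, 6)])

# -1 asks NumPy to work out that dimension from the total size (6 / 2 = 3 rows)
inferred = e.reshape(-1, 2)
print(inferred)

# order='F' reads and writes the elements column by column
column_major = e.reshape(3, 2, order='F')
print(column_major)
# [[1 5]
#  [4 3]
#  [2 6]]
```

The total number of elements must stay the same; a shape such as (4, 2) would raise a ValueError here.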
Broadcasting with array reorganizing#
It’s super cool and super useful. The one-line explanation is that when doing elementwise operations, things expand to the “correct” shape.
# add a scalar to a 1-d array
x = np.arange(5)
print('x: ', x)
print('x+1:', x + 1, end='\n\n')
y = np.random.uniform(size=(2, 5))
print('y: ', y, sep='\n')
print('y+1:', y + 1, sep='\n')
x: [0 1 2 3 4]
x+1: [1 2 3 4 5]
y:
[[0.20040757 0.96248829 0.29534113 0.14204329 0.00786338]
[0.68572271 0.22357762 0.88618684 0.91174698 0.87380356]]
y+1:
[[1.20040757 1.96248829 1.29534113 1.14204329 1.00786338]
[1.68572271 1.22357762 1.88618684 1.91174698 1.87380356]]
Since x has shape (5,) and y has shape (2, 5), their trailing dimensions match, so x is stretched across both rows of y and we can do operations between them.
x * y
array([[0. , 0.96248829, 0.59068226, 0.42612988, 0.03145352],
[0. , 0.22357762, 1.77237368, 2.73524094, 3.49521424]])
Without broadcasting we’d have to manually reshape our arrays, which quickly gets annoying.
x.reshape(1, -1).repeat(2, axis=0) * y
array([[0. , 0.96248829, 0.59068226, 0.42612988, 0.03145352],
[0. , 0.22357762, 1.77237368, 2.73524094, 3.49521424]])
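Broadcasting compares shapes from the last dimension backwards; two dimensions are compatible when they are equal or one of them is 1. The sketch below is one way to check that rule, assuming NumPy 1.20 or newer for np.broadcast_shapes; the array col is made up for the example.

```python
import numpy as np

x = np.arange(5)              # shape (5,)
col = np.array([[10], [20]])  # shape (2, 1)

# (5,) and (2, 1) broadcast to (2, 5)
print(np.broadcast_shapes(x.shape, col.shape))
print(x + col)
# [[10 11 12 13 14]
#  [20 21 22 23 24]]

# Trailing dimensions 5 and 3 differ and neither is 1, so this fails
try:
    x + np.arange(3)
except ValueError as err:
    print('cannot broadcast:', err)
```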
before = np.array([[1,2,3,4],[5,6,7,8]])
print(before)
after = before.reshape((2,4))
print(after)
[[1 2 3 4]
[5 6 7 8]]
[[1 2 3 4]
[5 6 7 8]]
Flatten Data#
When you work with some neural networks, such as a convnet, you need to flatten the array. You can use flatten().
Syntax:
ndarray.flatten(order='C')
order: Default is C, which flattens row by row (row-major order)
e.flatten()
array([1, 2, 3, 4, 5, 6])
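Two properties of flatten() are worth checking directly: the order argument changes which elements come first, and the result is a copy rather than a view. A minimal sketch, reusing the array e defined earlier:

```python
import numpy as np

e = np.array([(1, 2, 3), (4, 5, 6)])

flat = e.flatten()             # row by row
flat_f = e.flatten(order='F')  # column by column
print(flat)    # [1 2 3 4 5 6]
print(flat_f)  # [1 4 2 5 3 6]

flat[0] = 99    # flatten() returns a copy, so e is unchanged
print(e[0, 0])  # 1
```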
What is hstack?#
With hstack you can append data horizontally. This is a very convenient function in NumPy. Let’s study it with an example:
## Horizontal Stack
import numpy as np
f = np.array([1,2,3])
g = np.array([4,5,6])
print('Horizontal Append:', np.hstack((f, g)))
Horizontal Append: [1 2 3 4 5 6]
# Horizontal stack
h1 = np.ones((2,4))
h2 = np.zeros((2,2))
np.hstack((h1,h2))
array([[1., 1., 1., 1., 0., 0.],
[1., 1., 1., 1., 0., 0.]])
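For 2-D inputs, hstack joins arrays side by side, so they must have the same number of rows. The sketch below shows the resulting shape and what happens when the row counts differ; the 3-row array is made up for the example.

```python
import numpy as np

h1 = np.ones((2, 4))
h2 = np.zeros((2, 2))

# Columns add up (4 + 2); the row count stays at 2
print(np.hstack((h1, h2)).shape)  # (2, 6)

try:
    np.hstack((h1, np.zeros((3, 2))))  # 2 rows against 3 rows
except ValueError as err:
    print('cannot stack:', err)
```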
What is vstack?#
With vstack you can append data vertically. Let’s study it with an example:
## Vertical Stack
import numpy as np
f = np.array([1,2,3])
g = np.array([4,5,6])
print('Vertical Append:', np.vstack((f, g)))
Vertical Append: [[1 2 3]
[4 5 6]]
# Vertically stacking vectors
v1 = np.array([1,2,3,4])
v2 = np.array([5,6,7,8])
np.vstack([v1,v2,v1,v2])
array([[1, 2, 3, 4],
[5, 6, 7, 8],
[1, 2, 3, 4],
[5, 6, 7, 8]])
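vstack treats each 1-D input as a single row, which makes it handy for appending a row to an existing 2-D array. A short sketch of that use, with a made-up array named top:

```python
import numpy as np

v1 = np.array([1, 2, 3, 4])
top = np.ones((2, 4), dtype=int)

# v1 is treated as one row of shape (1, 4) and placed under top
stacked = np.vstack((top, v1))
print(stacked.shape)  # (3, 4)
print(stacked)

# Equivalent call with np.concatenate after adding the row axis by hand
same = np.concatenate((top, v1[np.newaxis, :]), axis=0)
print(np.array_equal(stacked, same))  # True
```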
Generate Random Numbers#
To generate random numbers from a Gaussian (normal) distribution, use:
Syntax:
numpy.random.normal(loc, scale, size)
loc: The mean, i.e. the center of the distribution
scale: The standard deviation
size: Number of samples to return (an int, or a tuple giving the output shape)
## Generate random number from normal distribution
normal_array = np.random.normal(5, 0.5, 10)
print(normal_array)
[4.73370838 5.02319487 5.03247675 4.63238643 5.20491465 4.23379671
4.65235094 4.03461004 4.81788758 5.40603744]
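The values above change on every run. When reproducible numbers are needed, one option is a seeded generator. The sketch below swaps in np.random.default_rng, which is not used in the text above, and picks the seed 0 arbitrarily; it also passes a tuple as size to get a 2-D result.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen arbitrarily for the example
samples = rng.normal(loc=5, scale=0.5, size=(1000, 3))

print(samples.shape)   # (1000, 3)
print(samples.mean())  # close to loc (5)
print(samples.std())   # close to scale (0.5)
```

With 3000 draws the sample mean and standard deviation land near loc and scale, which is a quick sanity check on the arguments.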
Linspace#
Linspace gives evenly spaced samples.
Syntax:
numpy.linspace(start, stop, num, endpoint)
start: Start of the sequence
stop: End of the sequence
num: Number of samples to generate. Default is 50
endpoint: If True (default), stop is the last value. If False, the stop value is not included.
# Example:
import numpy as np
np.linspace(0,10,6)
array([ 0., 2., 4., 6., 8., 10.])
# Example: For instance, it can be used to create 10 values from 1 to 5 evenly spaced.
import numpy as np
np.linspace(1.0, 5.0, num=10)
array([1. , 1.44444444, 1.88888889, 2.33333333, 2.77777778,
3.22222222, 3.66666667, 4.11111111, 4.55555556, 5. ])
If you do not want to include the last value of the interval, you can set endpoint to False:
np.linspace(1.0, 5.0, num=5, endpoint=False)
array([1. , 1.8, 2.6, 3.4, 4.2])
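The effect of endpoint on the spacing can be read off directly with the retstep argument, which makes linspace return the step alongside the values. A minimal sketch:

```python
import numpy as np

# endpoint=True: 5 samples span 4 gaps, so the step is (5 - 1) / 4
values, step = np.linspace(1.0, 5.0, num=5, retstep=True)
print(values, step)  # [1. 2. 3. 4. 5.] 1.0

# endpoint=False: the interval is cut into 5 gaps, so the step is (5 - 1) / 5
values_open, step_open = np.linspace(1.0, 5.0, num=5, endpoint=False, retstep=True)
print(values_open, step_open)
```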
LogSpace#
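logspace returns numbers evenly spaced on a log scale. It takes the same parameters as np.linspace, but start and stop are exponents: with the default base of 10, the sequence runs from 10**start to 10**stop.
Syntax:
numpy.logspace(start, stop, num, endpoint)
start: Exponent of the first value; the sequence starts at 10**start
stop: Exponent of the last value; the sequence ends at 10**stop
num: Number of samples to generate. Default is 50
endpoint: If True (default), 10**stop is the last value. If False, it is not included.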
# Example:
np.logspace(3.0, 4.0, num=4)
array([ 1000. , 2154.43469003, 4641.58883361, 10000. ])
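One way to read logspace is as linspace applied to the exponents. The sketch below checks that equivalence for the example above and also shows the base argument, which is not covered in the parameter list here.

```python
import numpy as np

exponents = np.linspace(3.0, 4.0, num=4)
log_values = np.logspace(3.0, 4.0, num=4)

# Same numbers as raising 10 to evenly spaced exponents
print(np.allclose(log_values, 10.0 ** exponents))  # True

# A different base: 2**0 up to 2**4
powers_of_two = np.logspace(0, 4, num=5, base=2)
print(powers_of_two)
```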
Finally, if you want to check the memory size in bytes of one element of an array, you can use .itemsize:
x = np.array([1,2,3], dtype=np.complex128)
x.itemsize
16
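The per-element size connects to the total footprint of the array: itemsize multiplied by the number of elements gives nbytes. A small sketch, with a float32 array named small added for comparison:

```python
import numpy as np

x = np.array([1, 2, 3], dtype=np.complex128)
print(x.itemsize, x.size, x.nbytes)  # 16 3 48

# A narrower dtype shrinks each element
small = np.array([1, 2, 3], dtype=np.float32)
print(small.itemsize, small.nbytes)  # 4 12
```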
Statistics#
NumPy has quite a few useful statistical functions for finding the minimum, maximum, percentiles, standard deviation, variance, etc. of the elements in an array. The most common ones are listed below:
Function | Numpy |
---|---|
Min | np.min() |
Max | np.max() |
Mean | np.mean() |
Median | np.median() |
Standard deviation | np.std() |
# Consider the following Array
import numpy as np
normal_array = np.random.normal(5, 0.5, 10)
print(normal_array)
[4.57057605 5.76249894 4.09935016 5.38336447 4.96652011 5.29402596
4.79025614 6.00232788 5.49574452 5.60059908]
# Example:Statistical function
### Min
print(np.min(normal_array))
### Max
print(np.max(normal_array))
### Mean
print(np.mean(normal_array))
### Median
print(np.median(normal_array))
### Sd
print(np.std(normal_array))
4.099350157876355
6.002327879593869
5.196526332130161
5.338695217273813
0.5550161930871431
stats = np.array([[1,2,3],[4,5,6]])
stats
array([[1, 2, 3],
[4, 5, 6]])
np.min(stats)
1
np.max(stats, axis=1)
array([3, 6])
np.sum(stats, axis=0)
array([5, 7, 9])
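The axis argument works the same way for every function in the table: axis=0 collapses the rows and gives one result per column, axis=1 gives one result per row. The sketch below applies that to the mean and also shows the ddof argument of np.std, which the examples above leave at its default.

```python
import numpy as np

stats = np.array([[1, 2, 3], [4, 5, 6]])

col_means = np.mean(stats, axis=0)  # one value per column
row_means = np.mean(stats, axis=1)  # one value per row
print(col_means)  # [2.5 3.5 4.5]
print(row_means)  # [2. 5.]

# np.std divides by N by default; ddof=1 divides by N - 1 (sample estimate)
population_sd = np.std(stats)
sample_sd = np.std(stats, ddof=1)
print(population_sd, sample_sd)
```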
Miscellaneous#
Load Data from File#
You can download the “data.txt” file from here. The code below expects it to be comma-separated and located in the working directory.
filedata = np.genfromtxt('data.txt', delimiter=',')
filedata = filedata.astype('int32') # you can also change type to 'int64'
print(filedata)
[[ 1 13 21 11 196 75 4 3 34 6 7 8 0 1 2 3 4 5]
[ 3 42 12 33 766 75 4 55 6 4 3 4 5 6 7 0 11 12]
[ 1 22 33 11 999 11 2 1 78 0 1 2 9 8 7 1 76 88]]
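To try the loading step without the file, np.genfromtxt also accepts a file-like object. The sketch below uses an in-memory io.StringIO with two made-up rows as a stand-in for data.txt; it also shows why the astype call is needed.

```python
import io
import numpy as np

# Hypothetical stand-in for data.txt: two comma-separated rows
text = io.StringIO("1,13,21\n3,42,12\n")

raw = np.genfromtxt(text, delimiter=',')
print(raw.dtype)  # float64: genfromtxt reads numbers as floats by default

as_int = raw.astype('int32')  # convert to integers, as done above
print(as_int)
```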
Boolean Masking and Advanced Indexing#
filedata >50
array([[False, False, False, False, True, True, False, False, False,
False, False, False, False, False, False, False, False, False],
[False, False, False, False, True, True, False, True, False,
False, False, False, False, False, False, False, False, False],
[False, False, False, False, True, False, False, False, True,
False, False, False, False, False, False, False, True, True]])
print(filedata)
filedata[filedata >50] # '[]' will display the value of data point from the dataset
[[ 1 13 21 11 196 75 4 3 34 6 7 8 0 1 2 3 4 5]
[ 3 42 12 33 766 75 4 55 6 4 3 4 5 6 7 0 11 12]
[ 1 22 33 11 999 11 2 1 78 0 1 2 9 8 7 1 76 88]]
array([196, 75, 766, 75, 55, 999, 78, 76, 88])
print(filedata)
np.any(filedata > 50, axis = 0) # axis=0 gives one result per column: True if any value in that column is > 50 (axis=1 would give one per row)
[[ 1 13 21 11 196 75 4 3 34 6 7 8 0 1 2 3 4 5]
[ 3 42 12 33 766 75 4 55 6 4 3 4 5 6 7 0 11 12]
[ 1 22 33 11 999 11 2 1 78 0 1 2 9 8 7 1 76 88]]
array([False, False, False, False, True, True, False, True, True,
False, False, False, False, False, False, False, True, True])
print(filedata)
np.all(filedata > 50, axis = 0) # True only where every value in the column is > 50 (axis=1 would check each row instead)
[[ 1 13 21 11 196 75 4 3 34 6 7 8 0 1 2 3 4 5]
[ 3 42 12 33 766 75 4 55 6 4 3 4 5 6 7 0 11 12]
[ 1 22 33 11 999 11 2 1 78 0 1 2 9 8 7 1 76 88]]
array([False, False, False, False, True, False, False, False, False,
False, False, False, False, False, False, False, False, False])
print(filedata)
(filedata > 50) & (filedata < 100)
[[ 1 13 21 11 196 75 4 3 34 6 7 8 0 1 2 3 4 5]
[ 3 42 12 33 766 75 4 55 6 4 3 4 5 6 7 0 11 12]
[ 1 22 33 11 999 11 2 1 78 0 1 2 9 8 7 1 76 88]]
array([[False, False, False, False, False, True, False, False, False,
False, False, False, False, False, False, False, False, False],
[False, False, False, False, False, True, False, True, False,
False, False, False, False, False, False, False, False, False],
[False, False, False, False, False, False, False, False, True,
False, False, False, False, False, False, False, True, True]])
print(filedata)
~((filedata > 50) & (filedata < 100)) # '~' means not
[[ 1 13 21 11 196 75 4 3 34 6 7 8 0 1 2 3 4 5]
[ 3 42 12 33 766 75 4 55 6 4 3 4 5 6 7 0 11 12]
[ 1 22 33 11 999 11 2 1 78 0 1 2 9 8 7 1 76 88]]
array([[ True, True, True, True, True, False, True, True, True,
True, True, True, True, True, True, True, True, True],
[ True, True, True, True, True, False, True, False, True,
True, True, True, True, True, True, True, True, True],
[ True, True, True, True, True, True, True, True, False,
True, True, True, True, True, True, True, False, False]])
### You can index with a list in NumPy
a = np.array([1,2,3,4,5,6,7,8,9])
a[[1,2,8]] # values at indexes 1, 2 and 8
array([2, 3, 9])
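The masks in this section can also be used to assign values and to locate matches, not only to select them. The sketch below uses a small made-up array named data rather than the file contents, and np.where is an addition not used above.

```python
import numpy as np

data = np.array([[1, 13, 196], [3, 766, 75]])  # small made-up sample

# Select with a combined mask
mask = (data > 50) & (data < 100)
print(data[mask])  # [75]

# Assign through a mask: cap everything above 100
capped = data.copy()
capped[capped > 100] = 100
print(capped)

# Find the (row, column) positions of the matches
rows, cols = np.where(data > 50)
positions = list(zip(rows.tolist(), cols.tolist()))
print(positions)  # [(0, 2), (1, 1), (1, 2)]
```

Working on a copy keeps the original data intact, since mask assignment modifies the array in place.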
Numpy Documentation#
This brief overview has touched on many of the important things that you need to know about numpy, but is far from complete. Check out the numpy reference to find out much more about numpy.