Welcome to WuJiGu Developer Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
1.6k views
in Technique[技术] by (71.8m points)

python - Convert a string with brackets to numpy array

Description of the problem:

I have an array-like structure in a dataframe column as a string (I read the dataframe from a csv file).

One string element of this column looks like this:

In  [1]: df.iloc[0]['points']    
Out [2]: '[(-0.0426, -0.7231, -0.4207), (0.2116, -0.1733, -0.1013), (...)]'

so it's really an array-like structure, which looks 'ready for numpy' to me.

numpy.fromstring() doesn't help as it doesn't like brackets:
convert string representation of array to numpy array in python

A simple numpy.array() on the string itself, if I copy and paste it in the array() function is returning me a numpy array.
But if I fill the array() function with the variable containing the string like that: np.array(df.iloc[0]['points']) it does not work, giving me a ValueError: could not convert string to float

Convert string to numpy array

The question:

Is there any function to do that in a simple way (without replacing or regex-ing the brackets)?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

You can use ast.literal_eval before passing to numpy.array:

from ast import literal_eval
import numpy as np

x = '[(-0.0426, -0.7231, -0.4207), (0.2116, -0.1733, -0.1013)]'

res = np.array(literal_eval(x))

print(res)

array([[-0.0426, -0.7231, -0.4207],
       [ 0.2116, -0.1733, -0.1013]])

You can do the equivalent with strings in a Pandas series, but it's not clear if you need to aggregate across rows. If this is the case, you can combine a list of NumPy arrays derived using the above logic.

The docs explain types acceptable to literal_eval:

Safely evaluate an expression node or a string containing a Python literal or container display. The string or node provided may only consist of the following Python literal structures: strings, bytes, numbers, tuples, lists, dicts, sets, booleans, and None.

So we are effectively converting a string to a list of tuples, which np.array can then convert to a NumPy array.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to WuJiGu Developer Q&A Community for programmer and developer-Open, Learning and Share
...