
My dataframe contains letters as values. I want to convert each letter to the number representing its Unicode code point. How can I do this?

For example, the dataframe looks like this:

   x  y  z
0  a  f  k
1  b  g  l
2  c  h  m
3  d  i  n
4  e  j  o

1 Answer


The ord() function returns the Unicode code point of a single character, e.g. ord("a") returns 97. It accepts only one character at a time, so you cannot pass the whole dataframe to ord() directly; instead, apply it element by element, for example one row or column at a time.
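To illustrate why ord() has to be applied element by element, here is a minimal sketch in plain Python (no pandas; the variable names are only for illustration). It shows ord() on one character, its inverse chr(), and the TypeError raised when the argument is longer than one character:

```python
# ord() maps exactly one character to its Unicode code point.
code = ord("a")       # 97
letter = chr(code)    # chr() is the inverse and gives back "a"

# A string that is not exactly one character long is rejected with a TypeError.
try:
    ord("ab")
    multi_char_ok = True
except TypeError:
    multi_char_ok = False

print(code, letter, multi_char_ok)
```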

The following example applies ord() to a dataframe one column at a time.

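import pandas as pd
# Sample dataframe whose values are single-character strings
df = pd.DataFrame({'x': ['a', 'b', 'c', 'd', 'e'], 'y': ['f', 'g', 'h', 'i', 'j'], 'z': ['k', 'l', 'm', 'n', 'o']})
# Replace each column with the Unicode code points of its characters
for c in df.columns:
    df[c] = df[c].apply(ord)
print(df)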

The code above converts each letter to its Unicode code point, giving the following dataframe:

     x    y    z
0   97  102  107
1   98  103  108
2   99  104  109
3  100  105  110
4  101  106  111
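If you would rather not write the loop yourself, the same conversion can be expressed in a single statement. This is only a sketch of one alternative, assuming every cell holds exactly one character; the names codes and letters are illustrative. It uses DataFrame.apply to visit each column and Series.map to call ord() on each element, and leaves the original dataframe unchanged:

```python
import pandas as pd

df = pd.DataFrame({
    'x': ['a', 'b', 'c', 'd', 'e'],
    'y': ['f', 'g', 'h', 'i', 'j'],
    'z': ['k', 'l', 'm', 'n', 'o'],
})

# DataFrame.apply passes each column (a Series) to the lambda;
# Series.map then calls ord() on every element of that column.
codes = df.apply(lambda col: col.map(ord))
print(codes)

# chr() reverses the conversion, turning the code points back into letters.
letters = codes.apply(lambda col: col.map(chr))
print(letters)
```

Because the result is assigned to a new variable, df still holds the letters, which can be handy if you need both representations.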