+3 votes
in Programming Languages by (75.0k points)
I have a list of strings, and I want to find the index of the first string in the list that contains a given substring.


aa = ["hello dhimdhim", "hello chipchip", "bye bye dhundhun", "aye aye tuntun"]

In the above list, if I search for substring "dhundhun", it should return 2.

What Python function should I use?

1 Answer

+1 vote
by (353k points)
Best answer

You can use the find() function from numpy.char. It is applied to every element of the list and returns, for each string, the lowest index within that string at which the substring is found.

Here is an example using your list.

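import numpy as np

aa = ["hello dhimdhim", "hello chipchip", "bye bye dhundhun", "aye aye tuntun"]

# position of "hello" within each string, or -1 where it is absent
indices = np.char.find(aa, "hello")
# indices of the list elements that contain the substring
idx = np.where(indices >= 0)[0]

# one liner
idx = np.where(np.char.find(aa, "hello") >= 0)[0]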

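The above code sets idx to [0 1] because the first and second elements of the list contain the substring "hello". Since you only want the first match, take idx[0] after checking that idx is not empty. Searching for "dhundhun" instead gives idx = [2], so idx[0] is 2.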

For each element, find() returns a value >= 0 if the substring is found; otherwise it returns -1.
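As far as I know, np.char.find() follows the same convention as Python's built-in str.find(), so the per-element result can be reproduced without NumPy. The snippet below is only a sketch to illustrate the -1 / position convention; the name positions is just for this example.

```python
aa = ["hello dhimdhim", "hello chipchip", "bye bye dhundhun", "aye aye tuntun"]

# str.find() gives the lowest position of the substring, or -1 if it is absent
positions = [s.find("hello") for s in aa]
print(positions)  # [0, 0, -1, -1]

# The position is an offset inside the string, not an index into the list
print(aa[2].find("dhundhun"))  # 8
```

Note that the value returned by find() is a character offset inside each string, which is why the list index still has to be recovered separately (for example with np.where).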

In the NumPy example above, the value of the variable indices is [ 0  0 -1 -1], i.e. the substring "hello" starts at position 0 in the first two elements, and the 3rd and 4th elements do not contain it.
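If you only need the index of the very first matching string, a plain-Python alternative may be enough. This is a minimal sketch, not part of the NumPy approach above; the helper name first_index_containing and the choice of -1 for "not found" are my own assumptions.

```python
def first_index_containing(strings, sub):
    """Return the index of the first string containing sub, or -1 if none does."""
    # The generator stops at the first match, so later strings are not scanned
    return next((i for i, s in enumerate(strings) if sub in s), -1)


aa = ["hello dhimdhim", "hello chipchip", "bye bye dhundhun", "aye aye tuntun"]

print(first_index_containing(aa, "dhundhun"))  # 2
print(first_index_containing(aa, "hello"))     # 0
print(first_index_containing(aa, "zzz"))       # -1
```

Unlike the np.where() version, this stops as soon as a match is found and does not build an intermediate array, which can matter for long lists.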