Count the frequency of characters at a position in a string in a Pandas DataFrame column?
Take this example DataFrame:
import pandas as pd

fake_data = {'columnA': ['XAVY', 'XAVY', 'XAVY', 'XAVY', 'XAVY', 'AXYV', 'AXYV', 'AXYV', 'AXYV', 'AXYV', 'AXYV']}
df = pd.DataFrame(fake_data, columns=['columnA'])
df
Robert Niro
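Maybe this helps. Split each string into one column per character position, count the values in each column, then convert the counts to percentages: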
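# Split on the DataFrame column (fake_data is a plain dict and has no .columnA).
# The empty pattern also yields an empty leading column 0, which is dropped.
new_data = df.columnA.str.split('', n=4, expand=True).drop(0, axis=1)
# Count how often each character occurs at each position.
stats = new_data.apply(pd.Series.value_counts)
# Convert counts to per-position percentages; absent characters become 0.
stats = stats.apply(lambda x: (x/x.sum())*100).round(2).fillna(0)
print(stats)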
Output:
       1      2      3      4
A  54.55  45.45   0.00   0.00
V   0.00   0.00  45.45  54.55
X  45.45  54.55   0.00   0.00
Y   0.00   0.00  54.55  45.45
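
As a possible alternative, here is a minimal sketch that avoids splitting on an empty pattern (its behavior may depend on the pandas version) by turning each string into a list of characters instead. The intermediate name chars is only illustrative, and the position labels are made 1-based to match the column headers used in the answer. It relies on value_counts(normalize=True) to get proportions directly.

```python
import pandas as pd

df = pd.DataFrame({"columnA": ["XAVY"] * 5 + ["AXYV"] * 6})

# One column per character position, labelled 1, 2, 3, ...
chars = pd.DataFrame(df["columnA"].map(list).tolist(), index=df.index)
chars.columns = range(1, chars.shape[1] + 1)

# Share of each character at each position, as a percentage.
# Characters that never occur at a position come out as NaN and are set to 0.
stats = (
    chars.apply(lambda col: col.value_counts(normalize=True))
    .fillna(0)
    .mul(100)
    .round(2)
)
print(stats)
```

To look up a single character at a single position, something like stats.loc["A", 1] should work, since the rows are indexed by character and the columns by position.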