The following solution worked. Instead of using int(x), we need to use len(x.encode('utf-8')), so the final code is updated as follows:
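import io

# Open read-only: the file is only read here, so 'r+' is not needed.
with io.open(r'C:\Python\Data\somefile.txt', 'r') as fp:
    # Take the last whitespace-separated field of each line.
    bytecolumn = (line.rsplit(None, 1)[1] for line in fp)
    # UTF-8 encoded length of each field, skipping '-' placeholders.
    byte_counts = (len(x.encode('utf-8')) for x in bytecolumn if x != '-')
    # Sum inside the with block: the generators read from fp lazily.
    print('Total', sum(byte_counts))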
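To make the difference between the two calls concrete, here is a minimal sketch. The helper names (utf8_length, total_utf8_length) and the sample lines are hypothetical and only illustrate the idea. Note that len(x.encode('utf-8')) counts the bytes in the text of the field, so a field such as '1234' contributes 4, not 1234; that may or may not be what a given file needs.

```python
def utf8_length(field):
    """Return the number of bytes in the UTF-8 encoding of field."""
    return len(field.encode('utf-8'))


def total_utf8_length(lines):
    """Sum the UTF-8 length of the last field of each line, skipping '-'."""
    fields = (line.rsplit(None, 1)[1] for line in lines)
    return sum(utf8_length(f) for f in fields if f != '-')


# Hypothetical sample data standing in for the lines of the file.
sample_lines = ['a hello\n', 'b -\n', 'c 1234\n']

# int() cannot parse non-numeric text, which is the original error.
try:
    int('hello')
    int_failed = False
except ValueError:
    int_failed = True

print(int_failed)
print(utf8_length('hello'))
print(utf8_length('1234'))
print(total_utf8_length(sample_lines))
```

If the last column is supposed to hold numeric byte counts, an alternative would be to keep int(x) and skip or report the fields that fail to parse, rather than measuring their length.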