strip() removes \r\n only at the start and end of a string, not inside it. If you have \r\n inside the text, use text = text.replace('\r\n', '').
It seems you get \r\n in the list created by extract(), so you have to use a list comprehension to remove it from every element of the list:
data = response.css(find).extract()
data = [x.replace('\r\n', '').strip() for x in data]
items[name] = data
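To make the difference between strip() and replace() concrete, here is a small standalone sketch. The sample string is made up for illustration and is not taken from the scraped page.

```python
# Made-up sample: \r\n at both ends and in the middle.
text = "\r\n first\r\nsecond \r\n"

stripped = text.strip()              # trims whitespace only at the two ends
replaced = text.replace("\r\n", "")  # removes every \r\n, wherever it is

print(repr(stripped))
print(repr(replaced))
```

Note that strip() leaves the inner \r\n untouched, while replace() leaves the spaces at the ends, which is why the list comprehension above chains both.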
EDIT: To remove the spaces and \r\n between sentences, you can use split('\r\n') to create a list of sentences, strip() every sentence, and then ' '.join() all the sentences back into one string:
text="Sentence 1\r\n Sentence 2"
data = text.split('\r\n')
data = [x.strip() for x in data]
text=" ".join(data)
print(text)
The same in one line:
text="Sentence 1\r\n Sentence 2"
text=" ".join(x.strip() for x in text.split('\r\n'))
print(text)
The same with the re module:
import re
text="Sentence 1\r\n Sentence 2"
text = re.sub(r'\r\n\s+', ' ', text)  # raw string, so \s is not treated as an invalid string escape
print(text)
for name, find in zip(names.values(), finder.values()):
    data = response.css(find.strip()).extract()
    # replace each \r\n plus the whitespace after it with a single space
    data = [re.sub(r'\r\n\s+', ' ', text) for text in data]
    items[name] = data
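The loop above needs a Scrapy response to run, so here is a sketch of the same cleanup as a standalone function that can be tried on plain strings. The name clean_all and the sample list are hypothetical. The pattern is also a slight extension of the one above, not the same thing: it assumes you also want to cover a bare \n and whitespace before the line break, and it trims the ends afterwards.

```python
import re

# Hypothetical helper, not part of Scrapy.
# Matches a line break (\r\n or \n) together with any whitespace around it.
_LINE_BREAK = re.compile(r"\s*\r?\n\s*")

def clean_all(values):
    """Collapse each line break (plus surrounding whitespace) into one space, then trim."""
    return [_LINE_BREAK.sub(" ", value).strip() for value in values]

sample = ["Sentence 1\r\n Sentence 2", "\r\n  Title \r\n"]
print(clean_all(sample))
```

Inside the loop it could then be used as items[name] = clean_all(response.css(find.strip()).extract()).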