For example, \n is a newline character inside a Python string; in a raw string it loses that meaning and simply means a backslash followed by n. string.split() breaks the string on the argument that is passed and returns all the parts in a list. The list does not include the splitting character(s).
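A minimal sketch of both points, using made-up sample strings (the variable names are only for illustration):

```python
# An ordinary string literal: \n is a single newline character.
normal = "a\nb"
# A raw string literal: \n stays as two characters, a backslash and an n.
raw = r"a\nb"

print(len(normal))  # 3
print(len(raw))     # 4

# split() returns the parts in a list and drops the separator itself.
parts = "red,green,blue".split(",")
print(parts)  # ['red', 'green', 'blue']
```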
To split on whitespace, see How do I split a string into a list of words?. To extract everything before the first delimiter, see Splitting on first occurrence. To extract everything before the last delimiter, see Partition string in Python and get value of last segment after colon.
To split on other delimiters, see Split a string by a delimiter in python. To split into individual characters, see How do I split a string into a list of characters?.
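As a rough illustration of the cases listed above, here is one way each could look with built-in string methods. The sample text is invented, and the linked questions may recommend other approaches:

```python
text = "key: value: extra"

# No argument: split on runs of whitespace.
words = "split  on   whitespace".split()

# Everything before the first delimiter.
before_first = text.partition(":")[0]

# Everything after the last delimiter.
after_last = text.rpartition(":")[2]

# Individual characters.
chars = list("abc")

print(words)         # ['split', 'on', 'whitespace']
print(before_first)  # key
print(after_last)    # ' extra' (the leading space is kept)
print(chars)         # ['a', 'b', 'c']
```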
To be fair, though, I specifically asked for split and then strip(), and strip() removes leading and trailing whitespace and doesn't touch anything in between. A slight change and your answer would work perfectly, though: mylist = mystring.strip().split(',') although I don't know if this is particularly efficient.
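To make the difference concrete, here is a small sketch with an invented value for mystring. Stripping first only trims the ends of the whole string; stripping each piece after the split also cleans the whitespace around the commas:

```python
mystring = "  a, b ,c  "

# strip() first: only the outer whitespace goes away.
strip_then_split = mystring.strip().split(",")

# split() first, then strip() each part.
split_then_strip = [part.strip() for part in mystring.split(",")]

print(strip_then_split)  # ['a', ' b ', 'c']
print(split_then_strip)  # ['a', 'b', 'c']
```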
Using split('\n') creates very confusing bugs when sharing files across operating systems. \n in Python represents a Unix line break (ASCII decimal code 10), regardless of the OS where you run it.
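A short sketch of the kind of bug meant here, assuming a string that mixes Windows (\r\n) and Unix (\n) line endings. str.splitlines() is shown as one alternative that recognizes both:

```python
# Hypothetical content with one Windows line ending and one Unix line ending.
text = "one\r\ntwo\nthree"

# Splitting on "\n" leaves a stray carriage return on the first line.
by_newline = text.split("\n")

# splitlines() treats \r\n, \n and \r as line boundaries.
by_lines = text.splitlines()

print(by_newline)  # ['one\r', 'two', 'three']
print(by_lines)    # ['one', 'two', 'three']
```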
Great. If you want alternating tokens and separators, as you usually do, it would of course be better to use \W+. The algorithm behind split also seems to indicate whether your list begins/ends with a token or a separator: if starting with a separator it prepends an empty string as the first element. If ending with a separator it adds an empty string as the last element. Useful.
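The behaviour described can be sketched with re.split, assuming the pattern is wrapped in a capturing group so that the separators are kept in the result (the sample strings are invented):

```python
import re

# Ends with a separator: an empty string is appended as the last element.
ends_with_sep = re.split(r"(\W+)", "hello, world!")

# Starts with a separator: an empty string is prepended as the first element.
starts_with_sep = re.split(r"(\W+)", "...hi")

print(ends_with_sep)    # ['hello', ', ', 'world', '!', '']
print(starts_with_sep)  # ['', '...', 'hi']
```

With the group in place, tokens sit at even indices and separators at odd indices, which is what makes the empty strings at the edges useful.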
1: If you're unsure what the first two parameters of .str.split() do, I recommend the docs for the plain Python version of the method. But how do you go from: a column containing two-element lists to: two columns, each containing the respective element of the lists? Well, we need to take a closer look at the .str attribute of a column.
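One possible way to do this, sketched on a made-up DataFrame (the column names "name", "first" and "last" are assumptions for illustration). Indexing through .str picks one element out of each list, and expand=True is an alternative that returns the parts as separate columns directly:

```python
import pandas as pd

df = pd.DataFrame({"name": ["Ada Lovelace", "Alan Turing"]})

# A column of two-element lists.
lists = df["name"].str.split(" ", n=1)

# .str[i] indexes into each list, giving one new column per element.
df["first"] = lists.str[0]
df["last"] = lists.str[1]

# Alternative: let split() expand the parts into columns itself.
expanded = df["name"].str.split(" ", n=1, expand=True)

print(df)
print(expanded)
```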
My current Python project will require a lot of string splitting to process incoming packages. Since I will be running it on a pretty slow system, I was wondering what the most efficient way to go ...