

How to handle accented & special character strings in Python

String operations on a string array containing strings with accented and/or special characters alongside regular ASCII strings can be quite an annoyance.

These days I am involved with web/mobile automation. The other day I had a challenge: parse all the strings on a page for a generic automation library I am writing. Since I was supposed to write a generic library to parse all strings on a page, I did not have the luxury of using ids for a specific control/component on the page. So I used the reliable XPath to parse strings in an Android application page. This would extract all the text attributes on the page, which was a good enough solution for me. (I would like to know a better solution using CSS selectors especially, if you have one!)

As the solution was so easy, I found it difficult to believe that the code had handled all the edge cases. To clear my doubts, I went about testing it on different applications with different inputs, until I hit a roadblock where the page was returning a mixture of accented strings, strings containing special characters and regular ASCII strings. Here is what the array looked like: strs =

If you have hit a similar challenge, read on for the solution.

Strings with accented or special characters are Unicode strings, while regular ones are ASCII. So to handle Unicode strings like regular ASCII strings, one has to convert the Unicode strings first. To do that, one has to encode the Unicode strings to UTF-8, which turns them into byte strings; note that accented characters then become multi-byte sequences rather than single ASCII characters. (For a history of Unicode, read a detailed article.) But wait, you also need to strip out the extra escape characters before doing string operations.

Here is how you do it in Python: text = text.encode('utf-8')
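To make the conversion concrete, here is a minimal sketch written for Python 3. It shows the UTF-8 encoding step from this post and, as an alternative that is not from the original solution, one common way to reduce accented text to plain ASCII using the standard library's unicodedata module. The function names and the sample list are hypothetical and chosen only for illustration; they are not the array from the page I was testing.

```python
import unicodedata


def to_utf8_bytes(text):
    """Encode a text string to UTF-8 bytes (the step described in the post)."""
    return text.encode("utf-8")


def to_plain_ascii(text):
    """Reduce text to ASCII: split accented characters into a base letter
    plus combining marks (NFKD), then drop every non-ASCII code point."""
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")


# Hypothetical sample data mixing accented, special-character and ASCII strings.
sample_strs = ["Café", "Niño", "Settings", "Prix: 5€"]

utf8_values = [to_utf8_bytes(s) for s in sample_strs]
ascii_values = [to_plain_ascii(s) for s in sample_strs]

for original, plain in zip(sample_strs, ascii_values):
    print(repr(original), "->", repr(plain))
```

Keep in mind that the second helper is lossy: characters with no ASCII base letter, such as the euro sign, are simply dropped. If you need to keep them, stay with the UTF-8 bytes or with the original text strings.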
