replace "strange" characters

regular - member
54 posts

Only strange because I do not know how to handle them. While processing some HTML I have come across some characters that I cannot deal with. I get &#xD6; and it should be presented as Ö, which I believe is Danish. There are others, but I think the same solution applies to all of them.

I should note that these characters are appearing in a larger string.

Thank you so much,
Todd

288 posts

Todd

There's a spec for "Special Characters" in HTML. For example, &copy; = the copyright symbol.

&#x...; I think is hex. So you could say it's as easy as &#x41;&#x42;&#x43; (ABC). Hex D6 is indeed &Ouml; = Ö. I worked in Finland for 7 years for Jaakko Pöyry Oy.

There is a method in NSMutableString, replaceOccurrencesOfString:withString:options:range:, which can probably deal with this in a "kind of" OK way. However, if you've got a lot of this to do, I suspect you might need an HTML parser to "normalize" your string.
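
A minimal sketch of that "kind of" OK approach, assuming you only care about a handful of references you already know about. The helper name ReplaceKnownEntities and the contents of the table are made up for illustration; only replaceOccurrencesOfString:withString:options:range: is the real Cocoa method.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not a Cocoa API): replaces a small, hand-picked set of
// HTML character references in place. References not in the table are left alone.
static void ReplaceKnownEntities(NSMutableString *text) {
    // dictionaryWithObjectsAndKeys: takes (value, key) pairs.
    NSDictionary *table = [NSDictionary dictionaryWithObjectsAndKeys:
        @"\u00D6", @"&#xD6;",   // Ö as a hex numeric reference
        @"\u00D6", @"&Ouml;",   // Ö as a named reference
        @"\u00F6", @"&ouml;",   // ö
        @"\u00A9", @"&copy;",   // copyright sign
        nil];

    for (NSString *entity in table) {
        // NSLiteralSearch asks for an exact, character-for-character match.
        // The range is rebuilt each pass because the length changes.
        [text replaceOccurrencesOfString:entity
                              withString:[table objectForKey:entity]
                                 options:NSLiteralSearch
                                   range:NSMakeRange(0, [text length])];
    }
}
```

You would call it on a mutable copy of your HTML text, for example ReplaceKnownEntities(myMutableString). The obvious weakness is that every reference has to be listed by hand, which is why a parser is the better bet for anything larger.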

There's a really great feature in Obj-C called a 'category', which lets you add methods to existing classes (about the same as the prototype feature in JavaScript). So you could add a method such as initWithHTMLstring to NSString, and it would parse the string for these character references. Goodness, this is so useful that you might find it's already provided in WebKit.
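
Here is a rough sketch of what such a category could look like, assuming you only need numeric references such as &#xD6; or &#214; and not named ones like &Ouml;. The category name NumericEntities and the method stringByDecodingNumericEntities are invented for this example; they are not part of Cocoa or WebKit.

```objc
#import <Foundation/Foundation.h>
#import <stdlib.h>

// Hypothetical category: name and method are illustrative, not a Cocoa API.
@interface NSString (NumericEntities)
- (NSString *)stringByDecodingNumericEntities;
@end

@implementation NSString (NumericEntities)

- (NSString *)stringByDecodingNumericEntities {
    NSMutableString *result = [NSMutableString stringWithCapacity:[self length]];
    NSScanner *scanner = [NSScanner scannerWithString:self];
    [scanner setCharactersToBeSkipped:nil];   // do not swallow whitespace

    while (![scanner isAtEnd]) {
        // Copy everything up to the next "&#" unchanged.
        NSString *chunk = nil;
        if ([scanner scanUpToString:@"&#" intoString:&chunk]) {
            [result appendString:chunk];
        }
        if ([scanner isAtEnd]) break;

        NSUInteger start = [scanner scanLocation];
        [scanner setScanLocation:start + 2];   // step over "&#"

        // "&#x..." or "&#X..." is hexadecimal, plain "&#..." is decimal.
        BOOL isHex = [scanner scanString:@"x" intoString:NULL] ||
                     [scanner scanString:@"X" intoString:NULL];
        NSCharacterSet *digits = [NSCharacterSet characterSetWithCharactersInString:
            (isHex ? @"0123456789abcdefABCDEF" : @"0123456789")];

        NSString *digitString = nil;
        BOOL ok = [scanner scanCharactersFromSet:digits intoString:&digitString] &&
                  [scanner scanString:@";" intoString:NULL];

        unsigned long code = 0;
        if (ok) {
            code = strtoul([digitString UTF8String], NULL, isHex ? 16 : 10);
            // Reject zero, surrogate values and anything beyond Unicode's range.
            ok = (code > 0 && code <= 0x10FFFF &&
                  !(code >= 0xD800 && code <= 0xDFFF));
        }

        if (!ok) {
            // Not a valid reference: keep the "&#" literally and carry on.
            [scanner setScanLocation:start + 2];
            [result appendString:@"&#"];
            continue;
        }

        // Encode the code point as one or two UTF-16 units.
        unichar units[2];
        NSUInteger count = 1;
        if (code < 0x10000) {
            units[0] = (unichar)code;
        } else {
            unsigned long v = code - 0x10000;
            units[0] = (unichar)(0xD800 + (v >> 10));
            units[1] = (unichar)(0xDC00 + (v & 0x3FF));
            count = 2;
        }
        [result appendString:[NSString stringWithCharacters:units length:count]];
    }
    return result;
}

@end
```

With that in place, [@"&#xD6;" stringByDecodingNumericEntities] gives you Ö, and the references can sit anywhere inside a larger string. Named references would still need a lookup table or a real HTML parser.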

__________________
regular - member
54 posts

I would think it is already provided; I just don't know what to look for. It seems to me almost everyone could use it. I know there is something for replacing spaces with % escapes in URLs, but I have no idea how to handle foreign languages. Would it be an encoding such as UTF-8 or something? If I change the encoding, will the "hex" get translated to its correct character?

Thanks again,
Todd
