Converting to UTF-8 character set for international characters
Shannon J.

2007-08-03, 6:16 pm

I hope someone can help me with a problem that has me tearing my hair out. My
web site (www.shannonjimenez.com) has text in both Spanish and English, which
means it has lots of international characters.

At first, in previews, all text appeared great, but when published to the
web, all intl. characters appeared as boxes. I then changed the encoding to
UTF-8, but that did not fix the problem and, when you look at the code,
charset=windows-1252 appears rather than the character set being UTF-8. I
tried a workaround of putting text "placeholders" in for each intl.
character and using the "Find in File" program to replace the placeholders
with HTML character entities (for é, ñ, etc.). This makes the page appear
fine in IE (except for a couple of em-dashes I forgot to replace), but in
Firefox all the HTML code is visible on the screen.
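
A likely reading of the symptoms above is a mismatch between the bytes in
the published file and the charset the page declares. The Python sketch
below is only an illustration, not a diagnosis of this particular site: the
sample string and variable names are made up, and it simply shows what each
kind of mismatch does to accented letters.

```python
# Sample text with two accented letters (e-acute and n-tilde).
# Written with escapes so the script itself stays pure ASCII.
text = "\u00e9 \u00f1"

utf8_bytes = text.encode("utf-8")      # two bytes per accented letter
cp1252_bytes = text.encode("cp1252")   # one byte per accented letter

# Case 1: the file is UTF-8 but the page says charset=windows-1252.
# Each accented letter turns into a pair of unrelated characters.
utf8_read_as_cp1252 = utf8_bytes.decode("cp1252")

# Case 2: the file is windows-1252 but the page says charset=utf-8.
# The single bytes are not valid UTF-8, so each one becomes U+FFFD,
# the replacement character that browsers often draw as a box or "?".
cp1252_read_as_utf8 = cp1252_bytes.decode("utf-8", errors="replace")
```

If the live page shows pairs such as "Ã©", the file is probably UTF-8 with a
windows-1252 label; boxes or question marks suggest the opposite mismatch.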

So my question is: how can I get Publisher to use the UTF-8 character set
and display my intl. characters correctly? Or is there some other workaround
I can use? My site is fairly basic and I would hate to have to buy and learn
an expensive/complicated program like Dreamweaver or Expression Web to
resolve the issue if Publisher can suffice!

Thanks in advance for your help...

~Shannon
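
The placeholder workaround described in the post above can be done in one
pass, without hand-typed placeholders, by converting every non-ASCII
character to a numeric character reference. This is a minimal Python sketch
under stated assumptions: the function name and sample string are
hypothetical, and the input must already have been read with its correct
encoding.

```python
def to_numeric_references(text: str) -> str:
    """Replace each non-ASCII character with an HTML numeric reference."""
    # The "xmlcharrefreplace" error handler emits &#NNN; for any character
    # that cannot be encoded as ASCII, and leaves ASCII text untouched.
    return text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")


# Sample with e-acute, an em dash, and n-tilde (written as escapes).
sample = "Jim\u00e9nez \u2014 espa\u00f1ol"
converted = to_numeric_references(sample)
# converted is now "Jim&#233;nez &#8212; espa&#241;ol"
```

The output is plain ASCII, so it displays the same whichever charset the
page declares. If references show up literally in one browser, a possible
cause is a malformed or double-escaped reference (for example "&amp;#233;"),
though nothing in this thread confirms that. As noted later in the thread,
Publisher regenerates its HTML, so any such edit has to be repeated after
each publish.
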
Shannon J.

2007-08-03, 6:16 pm

Forgot to mention I'm using Publisher 2007!
Mike Koewler

2007-08-03, 6:16 pm

Shannon,

If you go to Tools, Web Page Options, Web Site Options, under Encoding,
you can select Unicode (UTF-8).

Mike

Shannon J.

2007-08-03, 6:16 pm

Yes, I already did that. But, despite having done that, my pages still have
the windows-1252 character set and still don't display correctly.

Mary Sauer

2007-08-04, 6:19 pm

Is it because you are using "smart quotes"? It seems your "boxes" are appearing
where an apostrophe or quotation marks are typed. In Publisher, go to Tools,
AutoCorrect Options, AutoFormat As You Type tab, and clear the "Straight
quotes" with "smart quotes" check box.
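
For text that already contains smart quotes, the suggestion above can be
approximated after the fact. The Python sketch below is illustrative only:
the mapping table and function name are hypothetical, and it covers just the
four curly quote characters.

```python
# Curly quotes and their straight ASCII equivalents.
SMART_TO_STRAIGHT = {
    "\u2018": "'",   # left single quotation mark
    "\u2019": "'",   # right single quotation mark (typographic apostrophe)
    "\u201c": '"',   # left double quotation mark
    "\u201d": '"',   # right double quotation mark
}

_TABLE = str.maketrans(SMART_TO_STRAIGHT)


def straighten_quotes(text: str) -> str:
    """Return text with curly quotes replaced by straight ASCII quotes."""
    return text.translate(_TABLE)


# In windows-1252 the curly apostrophe is the single byte 0x92. That byte is
# not valid on its own in UTF-8, which is one way such characters can end up
# drawn as boxes when the declared charset does not match the file.
apostrophe_byte = "\u2019".encode("cp1252")
```

Straight quotes are plain ASCII, so they survive any of the charsets
discussed in this thread.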

--
Mary Sauer MSFT MVP
http://office.microsoft.com/
http://msauer.mvps.org/
news://msnews.microsoft.com


DavidF

2007-08-04, 6:19 pm

Shannon,

Here is an example of a Publisher web site in both English and Spanish:
http://www.somoscapazes.org/
I notice from this page that the code is UTF-8, and most of the formatting
looks ok.

When you change the encoding to UTF-8 and do a web page preview, look at the
source code. Does it show UTF-8? If so, then the reason your website page
doesn't show this may be that you are still looking at the old page with the
old default encoding. You may need to delete the old files off the server.
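
The check described above (does the source really say UTF-8, and do the
bytes agree?) can be scripted. This is a rough Python sketch, not a full
HTML parser: the function names are hypothetical, the regular expression
only looks for a charset value inside a meta tag, and the sample bytes stand
in for a real file.

```python
import re
from typing import Optional

# Matches charset=... inside a <meta ...> tag, quoted or unquoted.
_META_CHARSET = re.compile(
    rb"""<meta[^>]*charset\s*=\s*["']?([A-Za-z0-9._:-]+)""",
    re.IGNORECASE,
)


def declared_charset(html_bytes: bytes) -> Optional[str]:
    """Return the charset named in the first matching meta tag, if any."""
    match = _META_CHARSET.search(html_bytes)
    return match.group(1).decode("ascii").lower() if match else None


def is_valid_utf8(html_bytes: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        html_bytes.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Stand-in for a published page: the label says windows-1252, but the body
# holds e-acute as the two-byte UTF-8 sequence C3 A9.
page = (
    b'<html><head><meta http-equiv=Content-Type '
    b'content="text/html; charset=windows-1252"></head>'
    b'<body>caf\xc3\xa9</body></html>'
)
label = declared_charset(page)
bytes_are_utf8 = is_valid_utf8(page)
```

Running the same two checks on the local preview file and on a fresh copy
downloaded from the server would show whether the server still holds old
files. A charset sent in the server's HTTP Content-Type header takes
priority over the meta tag in browsers, so that is worth checking as well.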

As a general rule, you should not directly edit the code in a
Publisher web page. Even if you are successful, every time you make a change
to that page, you would have to edit the code again, which isn't very
practical. Generally, look for a workaround that can be done via the
Publisher page rather than by changing the code.

When comparing your pages to the example cited above, I notice that
your text and paragraph formatting are substantially different. In addition
to Mary's suggestion about the quotes, you might try removing the
justification and all other text box, line spacing, and text box spacing
settings, limiting your formatting to just font type and size, and testing
again. Also try Format > Styles and clear the formatting. If this helps, then
add the formatting back one element at a time, testing as you go. Some
formatting simply will not translate well to HTML, and you have the added
complication of using nonsupported symbols in Spanish.

DavidF

