Optimising Asian fonts for Multi-language flash sites
23 comments written so far...
So I'm sure you've built a website that needed to be in 1,238 different languages. And every time you reached Chinese, Japanese or Korean, you were surprised to find that the embedded font alone made your .swf 9,000 KB big.
For this project we're working on I'm mainly doing little tools with PHP. One of them is a translations manager, so you have a little SQL database with all the keywords and languages and someone fills it with data. At any point you can export it as .xml ready to be used in the website.
Having this set up, Theo came up with the idea that, as we had control over the text that was going to be needed for each language, we could write a script to output the list of characters needed for each font.
The PHP script boils down to this:
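// In this case $lines is an array of associative rows that comes from MySQL.
$list = array();
foreach ($lines as $line)
{
    $string = strip_tags($line["text"]);
    // Drop real line breaks as well as literal "\n" sequences stored in the text.
    $string = str_replace(array("\r", "\n", '\n'), '', $string);
    // The /u modifier makes "." match whole UTF-8 characters rather than bytes.
    preg_match_all('/./u', $string, $chars);
    foreach ($chars[0] as $char)
    {
        // Keep each character only once, in order of first appearance.
        if (!in_array($char, $list, true))
            $list[] = $char;
    }
}
$codes = array();
foreach ($list as $item)
{
    // "&#20840;" -> "20840" -> "5168" -> "U+5168" (this convmap only covers U+0000 to U+FFFF).
    $entity = mb_encode_numericentity($item, array(0x0, 0xffff, 0, 0xffff), 'UTF-8');
    $codes[] = "U+" . zeropad(strtoupper(dechex((int) substr($entity, 2, -1))), 4);
}
// Join with ", " so there is no trailing comma.
echo implode(", ", $codes);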
You'll also need this:
function zeropad($num, $lim)
{
    // Prepend zeros until the string is at least $lim characters long.
    return (strlen($num) >= $lim) ? $num : zeropad("0" . $num, $lim);
}
What this code does (once properly set up in your project) is split each string into characters and check, one by one, whether each has already been added to the list of used characters; if it's new, it gets added. Then it writes a Unicode list formatted as U+XXXX. The output looks something like this:
U+0043, U+0048, U+0041, U+004E, U+0045, U+004C, U+002E, U+004F, U+004D, U+0052, U+0044, U+0049, U+0054, U+0053, U+5168, U+5C4F, U+89C2, U+770B, U+5176, U+5B83, U+8BED, U+8A00, U+6CD5, U+5F8B, U+58F0, U+660E, U+97F3, U+91CF, U+5E55, U+540E, U+82B1, U+7D6E, U+5965, U+9EDB, U+4E3D, U+2022, U+5854, U+56FE, U+0020, U+4E0E, U+8BA9, U+002D, U+76AE, U+8036, U+5C14, U+70ED, U+5185, U+62CD, U+6444, U+8BB0, U+5F55, U+73B0, U+573A, U+5F71, U+7247, U+0032, U+5206, U+0030, U+79D2, U+0036, U+00B0, U+0035, U+4F20, U+5947, U+4E3A, U+4EC0, U+4E48, U+9009, U+5851, U+9020, U+5973, U+795E, U+642D, U+4E58, U+591C, U+95F4, U+5217, U+8F66, U+7684, U+4EBA, U+6027, U+611F, U+8BF1, U+60D1, U+4F60, U+6700, U+559C, U+7231, U+955C, U+5934, U+7B2C, U+4E00, U+6B21, U+7EED, U+5199, U+8F89, U+714C, U+4EE3, U+00BA, U+9999, U+6C34, U+6C1B, U+5FC6, U+6211, U+53F7, U+2014, U+79D8, U+6570, U+5B57, U+0039, U+5948, U+513F, U+4E4B, U+5E74, U+7537, U+4E3B, U+89D2, U+5D14, U+7EF4, U+65AF, U+0660, U+8FBE, U+6587, U+6CE2, U+7279, U+8FC7, U+7A0B, U+4E2D, U+7F8E, U+597D, U+56DE, U+5609, U+4F2F, U+8389, U+5212, U+65F6, U+521B, U+4F5C, U+73CD, U+8D35, U+6735, U+539F, U+6599, U+5999, U+8C03, U+548C, U+5242, U+7A7F, U+8D8A, U+5149, U+7ECF, U+5178, U+56DB, U+79CD, U+6F14, U+7ECE
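For reference, the same idea can be sketched more compactly. This is only a sketch under a few assumptions: it needs PHP 7.2+ with the mbstring extension (for mb_ord), it expects valid UTF-8 input, and the function name unicodeRangeList is made up for this example. It uses array keys as the "already seen" set instead of a nested loop.

```php
<?php
// Hypothetical helper (not part of the original script): returns the unique
// characters found in $texts as a "U+XXXX, U+XXXX, ..." string.
// Assumes PHP 7.2+ with mbstring (mb_ord) and valid UTF-8 input.
function unicodeRangeList(array $texts)
{
    $seen = array();
    foreach ($texts as $text) {
        $text = strip_tags($text);
        // Split the string into UTF-8 characters.
        $chars = preg_split('//u', $text, -1, PREG_SPLIT_NO_EMPTY);
        if ($chars === false) {
            continue; // invalid UTF-8, skip this string
        }
        foreach ($chars as $char) {
            if ($char === "\n" || $char === "\r") {
                continue; // line breaks don't need a glyph
            }
            // Integer keys keep first-seen order and de-duplicate for free.
            $seen[mb_ord($char, 'UTF-8')] = true;
        }
    }
    $codes = array();
    foreach (array_keys($seen) as $codePoint) {
        $codes[] = sprintf('U+%04X', $codePoint);
    }
    return implode(', ', $codes);
}

echo unicodeRangeList(array('<b>CHANNEL</b>', '全屏观看'));
// prints: U+0043, U+0048, U+0041, U+004E, U+0045, U+004C, U+5168, U+5C4F, U+89C2, U+770B
```

Because mb_ord returns the real code point, this version would also cope with characters above U+FFFF, which come out with five or six hex digits.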
What's this for, you'll ask. Well, just look at this:
[Embed(source="yourfont.ttf", fontFamily="YourFont", fontWeight= "bold", fontStyle = "normal",advancedAntiAliasing="true", mimeType="application/x-font-truetype", unicodeRange="U+0043, U+0048, U+0041, U+004E, U+0045, U+004C, U+002E, U+004F, U+004D, U+0052, U+0044, U+0049, U+0054, U+0053, U+5168, U+5C4F, U+89C2, U+770B, U+5176, U+5B83, U+8BED, U+8A00, U+6CD5, U+5F8B, U+58F0, U+660E, U+97F3, U+91CF, U+5E55, U+540E, U+82B1, U+7D6E, U+5965, U+9EDB, U+4E3D, U+2022, U+5854, U+56FE, U+0020, U+4E0E, U+8BA9, U+002D, U+76AE, U+8036, U+5C14, U+70ED, U+5185, U+62CD, U+6444, U+8BB0, U+5F55, U+73B0, U+573A, U+5F71, U+7247, U+0032, U+5206, U+0030, U+79D2, U+0036, U+00B0, U+0035, U+4F20, U+5947, U+4E3A, U+4EC0, U+4E48, U+9009, U+5851, U+9020, U+5973, U+795E, U+642D, U+4E58, U+591C, U+95F4, U+5217, U+8F66, U+7684, U+4EBA, U+6027, U+611F, U+8BF1, U+60D1, U+4F60, U+6700, U+559C, U+7231, U+955C, U+5934, U+7B2C, U+4E00, U+6B21, U+7EED, U+5199, U+8F89, U+714C, U+4EE3, U+00BA, U+9999, U+6C34, U+6C1B, U+5FC6, U+6211, U+53F7, U+2014, U+79D8, U+6570, U+5B57, U+0039, U+5948, U+513F, U+4E4B, U+5E74, U+7537, U+4E3B, U+89D2, U+5D14, U+7EF4, U+65AF, U+0660, U+8FBE, U+6587, U+6CE2, U+7279, U+8FC7, U+7A0B, U+4E2D, U+7F8E, U+597D, U+56DE, U+5609, U+4F2F, U+8389, U+5212, U+65F6, U+521B, U+4F5C, U+73CD, U+8D35, U+6735, U+539F, U+6599, U+5999, U+8C03, U+548C, U+5242, U+7A7F, U+8D8A, U+5149, U+7ECF, U+5178, U+56DB, U+79CD, U+6F14, U+7ECE")] public var FontClass:Class;
This way, only the characters you're actually using get imported from the .ttf into the .swf.
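If you'd rather not paste the list by hand, the export script could also print the whole metadata line. A minimal sketch, assuming the same attributes as the example above; the helper name embedTag is made up for this example, and the font file and family are placeholders.

```php
<?php
// Hypothetical helper: builds the [Embed] metadata line for a font subset.
// The attribute values mirror the example above; adjust them to your font.
function embedTag($fontFile, $fontFamily, $unicodeRange)
{
    return sprintf(
        '[Embed(source="%s", fontFamily="%s", fontWeight="bold", fontStyle="normal", '
        . 'advancedAntiAliasing="true", mimeType="application/x-font-truetype", '
        . 'unicodeRange="%s")] public var FontClass:Class;',
        $fontFile,
        $fontFamily,
        $unicodeRange
    );
}

// Placeholder values; in practice the last argument is the generated U+XXXX list.
echo embedTag('yourfont.ttf', 'YourFont', 'U+0043, U+0048, U+5168');
```

The result can be pasted into the ActionScript class that holds the font.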
In our case, Chinese went down from 9,554 KB to 45 KB. That's a 99.5% reduction. Pretty cool!
Hopefully this will save someone a few sleepless nights.
The price of such a solution is that it's more complicated to change the text :/
interesting solution anyway :)
Yeah. But usually the text doesn't change much at this stage.
The problem (or rather the inefficiency) for such an approach is that you would have to republish the font every time there is a copy change.
My main gripe is that there is really no good solution to this. Apart from the huge size of the non-latin character sets for fonts, I feel that at some level, the font downloading should be something handled on the browser (similar to @font-face in css).
Anyone wanting to learn how to create the font swf in Flash CS4, check out this Lee Brimelow tutorial:
http://www.gotoandlearn.com/play?id=102
And you're compiling the swf with the embed tag server-side any time the text changes?
No no, the text doesn't change. Although having that linked to an online CMS is an interesting idea too.
That's been the biggest pain for Chinese sites since forever... With AS3 it is much easier than before. For Flash projects we publish one swf that exports one or several text fields with the needed characters embedded, to store the fonts, and another text field as a public component for actual use... It may look complicated, but it does the magic. And yes, it is also complicated to change texts and add new characters... we've already got used to it...
It's a cool solution, but it doesn't work correctly when the texts are dynamic (an admin can enter a new word that uses other letters).
That's why I'm working on another, more complex solution:
streaming the needed chars (load only the chars you need at instant "t")
Demo:
http://memmie.lenglet.name/documents/lab/fontstream/waterfall_demo.html
Post (only in French at the moment)
http://memmie.lenglet.name/?p=33
No source yet, but I'll release it soon.
To describe it quickly: it uses the same hack as sound generation in Flash 9 (dynamic generation of SWF file bytecode), including the font data (the loaded chars), and voilà!
That's a really good solution too, Mem! I guess you're parsing the .ttf with PHP or something; otherwise, if you're accessing the .ttf directly, we would have the problem of making fonts public whose license doesn't allow that.
So ideally, the next step would be for the server to recompile a font swf each time the copy gets changed. Is this possible?
It is not impossible. But I think Mem's approach is much more interesting (streaming the font).
Font licensing can be a problem. I don't know.
PHP reads a specific binary file.
This file is generated from a SWF.
Each glyph's binary data is extracted (the SHAPE type in the SWF file format, plus extras like advance or kerning) and kept in the same form inside the generated file.
It's roughly the same as the SWF font data (DefineFont3), but reorganized for faster use and more.
The client receives the same file without the unneeded chars, and so on.
interesting.
Hey Mr Doob,
I'm pretty sure I know why you are posting this article :D
Don't worry, the end is coming very soon ;)
I wonder who you are samoth... ;D
Uhm... Arabic went in today! Arabic is tricky. Hint: If you think the characters are not being displayed properly, this is the step you're probably forgetting:
http://www.arabicode.com/en/flaraby/swf/
Oh! Seems like I know who you are now, samoth, "my people" found out ;)
nice solution, might rewrite it in flash.
I've been looking at ways of taking screenshots of non-embedded fonts and then smoothing them somehow to emulate the antialiasing of embedded fonts.
I have tried increasing the size of the text, getting a bitmap, blurring it slightly, turning on smoothing and scaling it back down, but it still looks very grainy and aliased.
It's annoying that _sans can't be made to look smooth in fp9.
lol yeah as easy as a question to A. ;)
@caz
If you have a program like Photoshop, you can usually write the text out with the font tool onto a transparent background, resize the file to the size you want, then save it as a 24-bit .PNG file. PNGs have alpha, which prevents both problems, unless fp9 has horrid .PNG rendering or you try to resize them in Flash. If that is the case, you could possibly (with time and patience) export a vector version to Flash, which will (hopefully) have no problems.
It would be very interesting to combine both your approaches + do some manual labour to get the most common chars nailed down for each language (not applicable for western languages).
You'd scan your content & generate font SWFs based on that, pretty simple stuff that could be integrated in the publish step for your CMS.
If you had text input on your site you would include the common chars & load the rest on the fly from Mem's font service.
All this would be wrapped in a special TextField component that knows about what font glyphs are loaded & can initiate the new glyph fetching, so the coder would never even have to worry about it – just get the font service running.
Maybe I've just re-capped what you have said, maybe not. :)
Hi,
Mem's approach is pretty cool!!
I worked on a system similar to the one described in this post back in 2005, using swfmill (http://swfmill.org/). Nowadays, AS3 lets you do it in a "more AS way", as mr.doob does. Btw, you can generate the font on the server: you will need the Flex SDK installed on your server to compile the SWF remotely. I have been using this in a few projects. Actually, what we did with swfmill was exactly the same:
1) Use CMS to feed your content
2) Fetch unique glyphs from content
3) Compile the SWF remotely
:)
swfmill works with AS2 as well. But mr.doob's approach is much more up-to-date. And, anyway, who wants to use AS2 any longer? ;)
very useful post, really a very good trick! ;)
*profile

traditional id: Ricardo Cabello Miguel
based in: London, UK
more: github, twitter, twitpic, soundcloud and flattr
shrinking some embeds.
doobsterelly, with the unicooodes.
lowering fiiilesiiize.
alriiight