| | #1 |
Join Date: Sep 2006 Location: Your Mom
Posts: 2,241
| eTail - Text Compression Algorithm?

While working on the idea for another text compression algorithm, the idea for this one hit me.

You should know

A character on Windows, i.e. a letter or symbol ( such as 'a', 'A', '1', '$' ), takes up 1 byte in a single-byte encoding such as ASCII. A byte is 8 bits ( binary ), so there are 2^8 = 256 different values in a byte.

10001100 - one of the 256 combinations

Most of these characters are almost never used, like % and ones I don't even know how to type. So, by using a smaller character set by default, and then extra bits for when a rare character is used, logically you should be able to save space.

00000 = a
00001 = b
00010 = c
...
11111 = *other*, followed by 5 more bits, e.g. 11111 + 10101 = $

So a common character would take 5 bits and a rare one would take 10, instead of every character taking 8 ( an estimate ).

This is where it gets ugly

My idea is to have a separate character set for every NEXT character, based on the CURRENT character. If you look at the statistics for commonly used characters in HTML, you will notice that the probability of the next character always drops off steeply after the few most likely ones.

[Chart: Probability of next character for 'a' in HTML]
[Chart: Probability of next character for '<' in HTML]

So, logically, by the laws of probability, this can be used to save data. Most of them, in terms of percentage per bit, look like this:

[Chart]
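
A minimal Python sketch of the idea so far, not the actual eTail implementation: one table per current character, a 5-bit code for each of its 31 most common followers, and 11111 as the escape. The names `build_tables`, `encode` and `decode` are made up for illustration. Two assumptions go beyond the post: the tables are ranked by frequency from a sample text, and an escaped character is written as a raw 8-bit value ( 13 bits in total rather than 10 ) so that any single-byte character can still be encoded.

```python
from collections import Counter, defaultdict

CODE_BITS = 5                    # bits per code, as in the post
ESCAPE = (1 << CODE_BITS) - 1    # 11111 is reserved for *other*
RAW_BITS = 8                     # assumption: escaped characters are sent as a raw byte


def build_tables(sample):
    """For each current character, rank the characters that followed it in the sample."""
    followers = defaultdict(Counter)
    prev = None  # None is the context of the very first character
    for ch in sample:
        followers[prev][ch] += 1
        prev = ch
    # Keep the 31 most frequent followers; code 31 stays free for the escape.
    return {ctx: [c for c, _ in counts.most_common(ESCAPE)]
            for ctx, counts in followers.items()}


def encode(text, tables):
    """Return the encoded text as a string of '0'/'1' characters.

    Assumes every character has a code point below 256 (single-byte text).
    """
    bits = []
    prev = None
    for ch in text:
        table = tables.get(prev, [])
        if ch in table:
            bits.append(format(table.index(ch), "05b"))
        else:
            bits.append(format(ESCAPE, "05b") + format(ord(ch), "08b"))
        prev = ch
    return "".join(bits)


def decode(bits, tables):
    """Invert encode(); needs exactly the same tables."""
    out = []
    prev = None
    i = 0
    while i < len(bits):
        code = int(bits[i:i + CODE_BITS], 2)
        i += CODE_BITS
        if code == ESCAPE:
            ch = chr(int(bits[i:i + RAW_BITS], 2))
            i += RAW_BITS
        else:
            ch = tables[prev][code]
        out.append(ch)
        prev = ch
    return "".join(out)


sample = '<a href="index.html">home</a> <a href="about.html">about</a>'
tables = build_tables(sample)
bits = encode(sample, tables)
assert decode(bits, tables) == sample
print(len(bits), "bits instead of", 8 * len(sample))
```

The decoder needs exactly the same tables as the encoder, so they would have to be fixed in advance or shipped with the data. Text that does not resemble the sample falls back to 13-bit escapes and can end up larger than the original.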
The algorithm is also very adaptive: just by feeding it examples of a given kind of text, it can work much better for that kind. It should also be much faster than current compression algorithms, such as gzip, as it doesn't rely on guesswork.

------------------------------------

Anyone follow? Anyone heard of something similar?

Last edited by MrApples; 08-05-2008 at 11:19 AM. |
| | |
| | #2 |
Join Date: Mar 2007 Location: Underground
Posts: 5,862
| Re: eTail - Text Compression Algorithm?

That is cool! I've never heard of anything like it! MrApples, your genius is showing... |
| | |
| | #3 |
Join Date: Jun 2008 Location: Northern Ireland
Posts: 840
| Re: eTail - Text Compression Algorithm?

I haven't heard of anything like it, and I don't really understand all of it. I follow it up to the point of the next-character probability thing. I understand what you mean about making some characters 5-bit and others 10-bit. What I find hard to understand, though, is what the whole probability thing would do. |
| | |
| | #4 | |
Join Date: Sep 2006 Location: Your Mom
Posts: 2,241
| Re: eTail - Text Compression Algorithm? Quote:
| |
| | |