webOS Nation Forums
> webOS apps and software
> webOS development
>
Autoreplace Upgrade
Member:
dimfeld
at: 02:10 AM 06/23/2009
Ok, so I took the 15,000 most common words list I posted earlier today, added contractions and some common acronyms, and manually converted spellings to American English by searching for "ou" and changing it to "o" where appropriate.
Here's the link:
drop.io dimfeld
This autoreplace file comes in at 347KB, 1/4 the size of the previous file from the Unix word list with max word length of 5, and hopefully it will be more useful.
I'm sure there are more words to be added, but this should be a better starting point than what I provided earlier.
EDIT: Updated with more British->American conversions.
EDIT2: I just realized that this was still using a max word length of 5. With a max word length of 7 I get a 1.2 MB file, down 200KB from the previous list.
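The British-to-American pass described above was done by hand. A minimal sketch of how the candidates for that manual review could be listed, assuming a plain Python list of words (the function name `ou_candidates` is hypothetical and not part of the posted script):

```python
def ou_candidates(words):
    """Return (original, suggestion) pairs for words containing "ou".

    The suggestion just drops the "u" from the first "ou". Most pairs
    will be wrong, so a person still has to pick the real conversions.
    """
    pairs = []
    for word in words:
        if "ou" in word:
            pairs.append((word, word.replace("ou", "o", 1)))
    return pairs


if __name__ == "__main__":
    for original, suggestion in ou_candidates(["colour", "favourite", "house"]):
        print(original, "->", suggestion)
```

Pairs such as colour/color are worth keeping, while pairs such as house/hose show why the "where appropriate" step cannot be automated this naively.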
The "Google corpus" from 2003 is an interesting list to use for this purpose.
The top 1000 words from there would be "webby"
hxxp://blogoscoped.com/archive/2003_11_03_index.html
Member:
dimfeld
at: 02:19 AM 06/23/2009
Ah, I didn't know Google had done something like this too. I'll take a look. Thanks for the link!
EDIT:
Ok, I took the first 15,000 words from the Google corpus and ran them through a spell checker, then added a few internet acronyms and other stuff.
I've reuploaded the zip file with three different autoreplace files now. All are using a max word length of 7.
text-edit-autoreplace-engcorpus is the file from the English corpus with my modifications.
text-edit-autoreplace-goog is generated from the Google corpus
text-edit-autoreplace-comb is generated from the two corpora merged.
The first two are about 1.2MB each, and the combined file is 1.7MB. I'm going to try using the Google corpus version on my Pre for now and see how that works out.
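The post does not say how the two corpora were merged. One possible approach, assuming each corpus is a list of words ordered from most to least frequent, is to interleave them and drop duplicates (the function `merge_ranked` and its `limit` parameter are hypothetical):

```python
def merge_ranked(first, second, limit=None):
    """Interleave two frequency-ranked word lists, dropping duplicates.

    Words keep roughly their original rank: position i of each list is
    visited before position i + 1 of either list.
    """
    merged = []
    seen = set()
    for i in range(max(len(first), len(second))):
        for source in (first, second):
            if i < len(source):
                word = source[i]
                if word not in seen:
                    seen.add(word)
                    merged.append(word)
    return merged if limit is None else merged[:limit]


if __name__ == "__main__":
    english = ["the", "of", "and"]
    google = ["the", "web", "of", "page"]
    print(merge_ranked(english, google))
```

Because the two lists overlap heavily, the merged list is smaller than the sum of its parts, which fits a combined file of 1.7MB against roughly 1.2MB for each source.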
Member:
dimfeld
at: 03:47 AM 06/23/2009
I just noticed that corrections that aren't in all lowercase are ignored, although you can have the result of a correction be in uppercase, such as id -> I'd. I've modified the script to make all misspellings lowercase, and I've reuploaded the new autocorrect files and script, still at
drop.io dimfeld
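The lowercase fix can be sketched as follows, assuming the entries are (misspelling, correction) pairs. This is an illustration only, not the code from the posted script, and `normalize_entries` is a hypothetical name:

```python
def normalize_entries(entries):
    """Lowercase every misspelling but leave the correction untouched.

    If two misspellings collapse to the same lowercase key, the first
    correction seen is kept.
    """
    normalized = {}
    for misspelling, correction in entries:
        normalized.setdefault(misspelling.lower(), correction)
    return normalized


if __name__ == "__main__":
    raw = [("Id", "I'd"), ("TEH", "the")]
    for misspelling, correction in normalize_entries(raw).items():
        print(misspelling, "->", correction)
```

Only the left-hand side is lowercased, so a result such as I'd keeps its capital letter.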
Member:
sacherjj
at: 08:45 AM 06/23/2009
That sounds like a much more manageable sized file.
Thanks OP you guys are rocking!
Member:
nimer55
at: 01:28 PM 06/23/2009
Originally Posted by Shane112358:
Thanks OP you guys are rocking!
Agreed....
By the time the Pre comes out in Canada it will be so ballen.
Dimfeld, can I post your research and results to the predev wiki?
It's ___really___ nice of you to have done this.
Rick
Member:
dimfeld
at: 02:57 PM 06/23/2009
Yeah, that's fine with me!
ok, on the way.
Dimfeld, please review this wiki post... hxxp://predev.wikidot.com/add-words-to-auto-correct-dictionary
Member:
dimfeld
at: 04:07 PM 06/23/2009
Looks good, thanks! The one correction I have is that the 344K size was still with a maximum word length of 5 letters. The files I'm using now and that are in the link on the Wiki have a maximum word length of 7 letters. The ones based on the Google corpus and the British National corpus are about 1.2MB each, and the one based on the two combined is about 1.7MB.
I do still have the 344KB file posted at
drop.io dimfeld, though this doesn't have the capitalization fixes that I mentioned in post #24. But using the script from the newer ZIP file with a maximum word length of 5 will produce a ~344KB file that handles capital letters correctly. I'll also update the existing 5-character word ZIP file when I get home late tonight.
EDIT: Updated the 344KB file. Still at the same link. With some other changes I made to the script it's now 400KB, but that should still be fine if you're looking for a very small file.
Fantastic work. I think I'm going to take the Google list, add in a few vulgar words, and I'll be good to go.
Member:
jngai
at: 12:11 AM 06/24/2009
Thanks. Going to try out the new list now. This quells my inner complaint that the iPhone has more autocorrections than the Pre.
Member:
dimfeld
at: 01:28 AM 06/24/2009
I've updated the ZIP file again. Here's what's new:
Words with capital letters now have a correction added so that when typed correctly in lowercase, the word will get the correct capitalization. Examples:
fbi -> FBI
facebook -> Facebook
I've added an "extra words" list for words that I want to always add regardless of which master word list I'm using. Words from this list are always included, even if they're longer than the maximum word length.
I also added a README to the zip file that describes a bit about how to use the script.
And here's the link again for good measure:
http://drop.io/dimfeld/asset/autoreplace-common-zip#
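The first two changes listed above can be sketched like this, assuming word lists are plain Python lists. The names `select_words` and `capitalization_entries` are hypothetical, and the real script in the ZIP file may work differently:

```python
def select_words(master, extra, max_len):
    """Keep master words up to max_len letters, then add every extra word.

    Extra words are always included, even when longer than max_len.
    """
    chosen = [word for word in master if len(word) <= max_len]
    seen = set(chosen)
    for word in extra:
        if word not in seen:
            seen.add(word)
            chosen.append(word)
    return chosen


def capitalization_entries(words):
    """Map the all-lowercase form of a word to its properly cased form."""
    return {word.lower(): word for word in words if word != word.lower()}


if __name__ == "__main__":
    words = select_words(["the", "FBI", "internet"], ["Facebook"], max_len=7)
    for typed, fixed in capitalization_entries(words).items():
        print(typed, "->", fixed)
```

In this example "internet" is dropped because it is longer than 7 letters, while "Facebook" survives because it comes from the extra words list.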
marathon, eh?
Your extra words are way more socially acceptable than mine :>
How does the 1.2MB file perform on the Pre? Does it still have the sluggish issues with the web browser? Or has this been fixed with the latest adjustments?
- Garrett
Member:
dimfeld
at: 02:37 AM 06/24/2009
It takes about 3 or 4 seconds to load webkit with the 1.2MB file. I don't remember how long it was with the original autoreplace file, but I think it's still slower. That said, I'm satisfied with the speed, so it's really just a matter of preference.
Member:
optik678
at: 06:39 AM 06/24/2009
The most useful mod besides rooting! Thanks OP
Member:
wprater
at: 02:33 PM 06/25/2009
"Pre" gets replaced with "ore". NO