Results 21 to 40 of 161
  1.    #21  
    Ok, so I took the 15,000 most common words list I posted earlier today, added contractions and some common acronyms, and manually converted spellings to American English by searching for "ou" and changing it to "o" where appropriate.

    Here's the link: dimfeld

    This autoreplace file comes in at 347KB, 1/4 the size of the previous file from the Unix word list with max word length of 5, and hopefully it will be more useful.
    I'm sure there are more words to be added, but this should be a better starting point than what I provided earlier.

    EDIT: Updated with more British->American conversions.
    EDIT2: I just realized that this was still using a max word length of 5. With a max word length of 7 I get a 1.2 MB file, down 200KB from the previous list.
    Last edited by dimfeld; 06/23/2009 at 03:14 AM.
  2. #22  
    The "Google corpus" from 2003 is an interesting list to use for this purpose.

    The top 1000 words from there would be "webby".

  3.    #23  
    Ah, I didn't know Google had done something like this too. I'll take a look. Thanks for the link!

    Ok, I took the first 15,000 words from the Google corpus and ran them through a spell checker, then added a few internet acronyms and other stuff.

    I've reuploaded the zip file with three different autoreplace files now. All are using a max word length of 7.
    text-edit-autoreplace-engcorpus is the file from the English corpus with my modifications.
    text-edit-autoreplace-goog is generated from the Google corpus
    text-edit-autoreplace-comb is generated from the two corpora merged.

    The first two are about 1.2MB each, and the combined file is 1.7MB. I'm going to try using the Google corpus version on my Pre for now and see how that works out.
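
For anyone curious why the combined file is smaller than the two individual files added together: words that appear in both corpora only need to be listed once. A minimal sketch of that kind of merge, assuming each corpus is just a list of words in frequency order. The function name `merge_word_lists` and the sample words are made up for illustration and are not taken from the actual script.

```python
def merge_word_lists(first, second):
    """Return the union of two word lists, keeping first-seen order."""
    seen = set()
    merged = []
    for word in list(first) + list(second):
        key = word.lower()  # treat "The" and "the" as the same word
        if key not in seen:
            seen.add(key)
            merged.append(word)
    return merged


# Hypothetical sample data, not the real corpora.
english = ["the", "of", "house", "whilst"]
google = ["the", "of", "home", "email"]
print(merge_word_lists(english, google))
```

Here "the" and "of" appear in both lists but end up in the merged list only once, so the result has six words rather than eight.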
    Last edited by dimfeld; 06/23/2009 at 03:33 AM.
  4.    #24  
    I just noticed that corrections that aren't in all lowercase are ignored, although you can have the result of a correction be in uppercase, such as id -> I'd. I've modified the script to make all misspellings lowercase, and I've reuploaded the new autocorrect files and script, still at dimfeld
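
The lowercase fix described above can be sketched roughly like this, assuming the entries are (misspelling, correction) pairs. `normalize_entries` is a hypothetical name, and this is not the code from the posted script, only an illustration of the idea.

```python
def normalize_entries(entries):
    """Lowercase each misspelling; leave the correction's case untouched."""
    normalized = {}
    for misspelling, correction in entries:
        # setdefault keeps the first correction seen if two misspellings
        # collapse to the same lowercase key.
        normalized.setdefault(misspelling.lower(), correction)
    return normalized


# Hypothetical sample entries.
entries = [("Id", "I'd"), ("Teh", "the"), ("teh", "the")]
print(normalize_entries(entries))
# {'id': "I'd", 'teh': 'the'}
```

The misspelling side is always lowercase, while the result of a correction such as I'd keeps its capital letter.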
    Last edited by dimfeld; 06/23/2009 at 03:58 AM.
  5. #25  
    That sounds like a much more manageable sized file.
  6. #26  
    Thanks OP you guys are rocking!
  7. #27  
    Quote Originally Posted by Shane112358 View Post
    Thanks OP you guys are rocking!

    By the time the Pre comes out in Canada it will be so ballin'.
  8. #28  
    Dimfeld, can I post your research and results to the predev wiki?

    It's ___really___ nice of you to have done this.

  9.    #29  
    Yeah, that's fine with me!
  10. #30  
    ok, on the way.
  11. #31  
    Dimfeld, please review this wiki post... hxxp://
  12.    #32  
    Looks good, thanks! The one correction I have is that the 344KB size was still with a maximum word length of 5 letters. The files I'm using now, which are the ones linked on the wiki, have a maximum word length of 7 letters. The ones based on the Google corpus and the British National corpus are about 1.2MB each, and the one based on the two combined is about 1.7MB.

    I do still have the 344KB file posted at dimfeld, though this doesn't have the capitalization fixes that I mentioned in post #24. But using the script from the newer ZIP file with a maximum word length of 5 will produce a ~344KB file that handles capital letters correctly. I'll also update the existing 5-character word ZIP file when I get home late tonight.

    EDIT: Updated the 344KB file. Still at the same link. With some other changes I made to the script it's now 400KB, but that should still be fine if you're looking for a very small file.
    Last edited by dimfeld; 06/24/2009 at 04:14 PM.
  13. #33  
    Fantastic work. I think I'm going to take the Google list, add in a few vulgar words, and I'll be good to go.
    Just waiting for the day my Pre has contacts grouping, and a "speed dial" for text messaging
  14. #34
    Thanks. Going to try out the new list now. This quells my inner complaint that the iPhone has more autocorrections than the Pre.
  15.    #35  
    I've updated the ZIP file again. Here's what's new:

    Words with capital letters now have a correction added so that when typed correctly in lowercase, the word will get the correct capitalization. Examples:
    fbi -> FBI
    facebook -> Facebook
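
A minimal sketch of how entries like these could be generated, assuming the word list stores each word with its proper capitalization. The name `capitalization_entries` is made up for this example and the real script may do it differently.

```python
def capitalization_entries(words):
    """Map the lowercase spelling of each capitalized word to its proper form."""
    # Words that are already all lowercase need no entry.
    return {w.lower(): w for w in words if w != w.lower()}


print(capitalization_entries(["FBI", "Facebook", "marathon"]))
# {'fbi': 'FBI', 'facebook': 'Facebook'}
```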

    I've added an "extra words" list for words that I want to always add regardless of which master word list I'm using. Words from this list are always included, even if they're longer than the maximum word length.
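
Roughly, the selection works like the sketch below: the master list is filtered by the maximum word length, and the extra words are added afterwards without any length check. `select_words` and its parameters are hypothetical names used for illustration, not the script's actual interface.

```python
def select_words(master_words, extra_words, max_length=7):
    """Keep master words up to max_length, then add every extra word."""
    selected = [w for w in master_words if len(w) <= max_length]
    present = set(selected)
    for w in extra_words:
        # Extra words skip the length check but are not duplicated.
        if w not in present:
            selected.append(w)
            present.add(w)
    return selected


# Hypothetical sample lists.
print(select_words(["the", "keyboard", "phone"], ["marathon", "phone"]))
# ['the', 'phone', 'marathon']
```

In this example "keyboard" is dropped for being longer than 7 letters, while "marathon" is kept because it comes from the extra words list.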

    I also added a README to the zip file that describes a bit about how to use the script.

    And here's the link again for good measure:
  16. #36  
    marathon, eh?

    Your extra words are way more socially acceptable than mine :>
  17. #37  
    How does the 1.2MB file perform on the Pre? Does it still make the web browser sluggish? Or has this been fixed with the latest adjustments?

    - Garrett
  18.    #38  
    It takes about 3 or 4 seconds to load webkit with the 1.2MB file. I don't remember how long it was with the original autoreplace file, but I think it's still slower. That said, I'm satisfied with the speed, so it's really just a matter of preference.
  19. #39  
    The most useful mod besides rooting! Thanks OP
  20. #40
    "Pre" gets replaced with "ore". NO