qminer

A C++ based data analytics platform for processing large-scale real-time streams containing structured and unstructured data

1,041 lines (990 loc) 8.49 kB
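The stop list that follows states its own format: comments begin with a vertical bar, and each stop word is at the start of a line. A minimal sketch of a reader for that format is shown here. The function name `parse_stop_words` and the sample text are hypothetical, and this is not qminer's own loader.

```python
def parse_stop_words(text):
    """Collect stop words from Snowball-style stop list text.

    Assumed format (taken from the list's own header comment):
    a vertical bar starts a comment, and the stop word, if any,
    is the first token on the line.
    """
    words = set()
    for line in text.splitlines():
        # Drop everything from the first vertical bar onward.
        entry = line.split("|", 1)[0].strip()
        if not entry:
            continue  # blank line or comment-only line
        # Keep only the first whitespace-separated token.
        words.add(entry.split()[0])
    return words


# Hypothetical sample in the same layout as the list.
sample = """\
| An English stop word list.
i              | subject, always in upper case of course
me             | object

|will
would
isn't
t.co           | twitter url shortener
"""

stop_words = parse_stop_words(sample)
```

Entries written after a vertical bar, such as `|will` in the sample, are treated as comments and skipped, which matches how the list marks words its author chose to leave out.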
| From svn.tartarus.org/snowball/trunk/website/algorithms/english/stop.txt
| This file is distributed under the BSD License.
| See http://snowball.tartarus.org/license.php
| Also see http://www.opensource.org/licenses/bsd-license.html
|  - Encoding was converted to UTF-8.
|  - This notice was added.
|  - Added 1 letter tokens
|  - Uncommented common words at the end of the original snowball stopwords
|  - Added En523 stopwords not present in the original snowball stopwords
|  - Added URL parts

| An English stop word list. Comments begin with vertical bar. Each stop
| word is at the start of a line.

| All tokens consisting of 1 letter from the alphabet
a
b
c
d
e
f
g
h
i
j
k
l
m
n
o
p
q
r
s
t
u
v
w
x
y
z

| Many of the forms below are quite rare (e.g. "yourselves") but included for
| completeness.

| PRONOUNS FORMS
| 1st person sing

i              | subject, always in upper case of course
me             | object
my             | possessive adjective
               | the possessive pronoun `mine' is best suppressed, because of the
               | sense of coal-mine etc.
myself         | reflexive

| 1st person plural
we             | subject
| us           | object
               | care is required here because US = United States. It is usually
               | safe to remove it if it is in lower case.
our            | possessive adjective
ours           | possessive pronoun
ourselves      | reflexive

| second person (archaic `thou' forms not included)
you            | subject and object
your           | possessive adjective
yours          | possessive pronoun
yourself       | reflexive (singular)
yourselves     | reflexive (plural)

| third person singular
he             | subject
him            | object
his            | possessive adjective and pronoun
himself        | reflexive
she            | subject
her            | object and possessive adjective
hers           | possessive pronoun
herself        | reflexive
it             | subject and object
its            | possessive adjective
itself         | reflexive

| third person plural
they           | subject
them           | object
their          | possessive adjective
theirs         | possessive pronoun
themselves     | reflexive

| other forms (demonstratives, interrogatives)
what
which
who
whom
this
that
these
those

| VERB FORMS (using F.R. Palmer's nomenclature)

| BE
am             | 1st person, present
is             | -s form (3rd person, present)
are            | present
was            | 1st person, past
were           | past
be             | infinitive
been           | past participle
being          | -ing form

| HAVE
have           | simple
has            | -s form
had            | past
having         | -ing form

| DO
do             | simple
does           | -s form
did            | past
doing          | -ing form

| The forms below are, I believe, best omitted, because of the significant
| homonym forms:
|  He made a WILL
|  old tin CAN
|  merry month of MAY
|  a smell of MUST
|  fight the good fight with all thy MIGHT
| would, could, should, ought might however be included

|          | AUXILIARIES
|             | WILL
|will
would
|             | SHALL
|shall
should
|             | CAN
|can
could
|             | MAY
|may
|might
|             | MUST
|must
|             | OUGHT
ought

| COMPOUND FORMS, increasingly encountered nowadays in 'formal' writing

| pronoun + verb
i'm
you're
he's
she's
it's
we're
they're
i've
you've
we've
they've
i'd
you'd
he'd
she'd
we'd
they'd
i'll
you'll
he'll
she'll
we'll
they'll

| verb + negation
isn't
aren't
wasn't
weren't
hasn't
haven't
hadn't
doesn't
don't
didn't

| auxiliary + negation
won't
wouldn't
shan't
shouldn't
can't
cannot
couldn't
mustn't

| miscellaneous forms
let's
that's
who's
what's
here's
there's
when's
where's
why's
how's

| rarer forms
| daren't needn't

| doubtful forms
| oughtn't mightn't

| ARTICLES
a
an
the

| THE REST (Overlap among prepositions, conjunctions, adverbs etc is so
| high, that classification is pointless.)
and
but
if
or
because
as
until
while
of
at
by
for
with
about
against
between
into
through
during
before
after
above
below
to
from
up
down
in
out
on
off
over
under
again
further
then
once
here
there
when
where
why
how
all
any
both
each
few
more
most
other
some
such
as
no
nor
not
only
own
same
so
than
too
very

| the following words are among the commonest in English
one
every
least
less
many
now
ever
never
say
says
said
also
get
go
goes
just
made
make
put
see
seen
whether
like
well
back
even
still
way
take
since
nother
however
two
three
four
five
first
second
new
old
high
long

| extra words from En523
secondly
consider
whoever
edu
causes
seemed
whose
certainly
th
sorry
sent
far
cause
hereafter
try
likely
appear
brief
sup
respectively
let
others
alone
along
allows
howbeit
usually
que
changes
thats
hither
via
useful
merely
viz
everybody
use
contains
next
therefore
taken
thru
tell
knows
becomes
hereby
everywhere
particular
known
must
none
oh
anywhere
nine
can
following
example
indicated
indicates
something
want
needs
rather
six
instead
okay
tried
may
different
tries
third
whenever
maybe
appreciate
specifying
allow
keeps
looking
help
indeed
mainly
soon
course
looks
thank
thence
selves
inward
actually
better
willing
thanx
might
non
someone
somebody
thereby
several
name
always
reasonably
whither
went
mean
everyone
eg
ex
et
beyond
really
furthermore
rd
re
seriously
got
forth
thereupon
given
quite
whereupon
besides
ask
anyhow
hereupon
keep
ltd
hence
onto
think
already
seeming
thereafter
awfully
done
another
little
accordingly
anyone
indicate
gives
mostly
exactly
took
immediate
regards
somewhat
believe
specify
unfortunately
gotten
zero
toward
beforehand
unlikely
need
seem
saw
clearly
relatively
thoroughly
self
able
aside
thorough
towards
unless
though
eight
nothing
sub
don
especially
noone
sometimes
definitely
normally
came
saying
particularly
anyway
fifth
outside
going
meanwhile
overall
truly
ones
nearly
despite
regarding
qv
twice
contain
thanks
ignored
namely
anyways
best
wonder
away
currently
please
behind
various
hopefully
probably
neither
across
available
come
last
whereafter
according
somewhere
became
whole
comes
otherwise
among
presumably
co
afterwards
whatever
moreover
throughout
considering
sensible
described
much
hardly
wants
corresponding
latterly
concerning
else
former
novel
look
value
will
near
theres
seven
ve
almost
wherever
thus
herein
cant
vs
ie
containing
etc
perhaps
insofar
nobody
wherein
beside
gets
used
upon
uses
kept
whereby
nevertheless
com
anybody
obviously
without
latter
lest
downwards
liked
greetings
followed
yes
yet
unto
seems
except
around
possible
know
using
apart
necessary
follows
either
become
therein
right
often
somehow
sure
specified
happens
shall
per
everything
asking
provides
tends
nowhere
although
entirely
ok
anything
getting
whence
plus
consequently
seeing
formerly
within
appropriate
inasmuch
inner
elsewhere
enough
becoming
amongst
hi
trying
wish
us
placed
un
gone
later
associated
certain
doesn
sometime
inc
uucp
whereas
nd
lately
regardless
welcome
together
serious
hello

| URL parts
http
https
www
t.co           | twitter url shortener

| ccTLDs
af
al
dz
ad
ao
ag
ar
am
au
at
az
bs
bh
bd
bb
by
be
bz
bj
bt
bo
ba
bw
br
bn
bg
bf
bi
kh
cm
ca
cv
cf
td
cl
cn
co
km
cd
cg
cr
ci
hr
cu
cy
cz
dk
dj
dm
do
ec
eg
sv
gq
er
ee
et
fj
fi
fr
ga
gm
ge
de
gh
gr
gd
gt
gn
gw
gy
ht
hn
hu
is
in
id
ir
iq
ie
il
it
jm
jp
jo
kz
ke
ki
kp
kr
kw
kg
la
lv
lb
ls
lr
ly
li
lt
lu
mk
mg
mw
my
mv
ml
mt
mh
mr
mu
mx
fm
md
mc
mn
me
yu
ma
mz
mm
na
nr
np
nl
nz
ni
ne
ng
no
om
pk
pw
pa
pg
py
pe
ph
pl
pt
qa
ro
ru
su
rw
kn
lc
vc
ws
sm
st
sa
sn
rs
yu
sc
sl
sg
sk
si
sb
so
za
es
lk
sd
sr
sz
se
ch
sy
tj
tz
th
tl
tg
to
tt
tn
tr
tm
tv
ug
ua
ae
uk
us
uy
uz
vu
va
ve
vn
ye
zm
zw
ge
tw
az
nc.tr
md
so
ge
au
cx
cc
au
hm
nf
nc
pf
yt
gp
pm
wf
tf
pf
bv
ck
nu
tk
gg
im
je
ai
bm
io

| BCP47 languages (two-letter)
aa
ab
ae
af
ak
am
an
ar
as
av
ay
az
ba
be
bg
bh
bi
bm
bn
bo
br
bs
ca
ce
ch
co
cr
cs
cu
cv
cy
da
de
dv
dz
ee
el
en
eo
es
et
eu
fa
ff
fi
fj
fo
fr
fy
ga
gd
gl
gn
gu
gv
ha
he
hi
ho
hr
ht
hu
hy
hz
ia
id
ie
ig
ii
ik
in
io
is
it
iu
iw
ja
ji
jv
jw
ka
kg
ki
kj
kk
kl
km
kn
ko
kr
ks
ku
kv
kw
ky
la
lb
lg
li
ln
lo
lt
lu
lv