UNPKG

sbd-fork

Version:

Split text into sentences with Sentence Boundary Detection (SBD).

114 lines (93 loc) 4.57 kB
<!doctype html> <html class="no-js"> <head> <meta charset="utf-8"> <meta http-equiv="x-ua-compatible" content="ie=edge"> <title>Sentence Boundary Detection (SBD).</title> <meta name="description" content="Sentence Boundary Detection with javascript"> <meta name="viewport" content="width=device-width, initial-scale=1"> <link href="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.4/css/bootstrap.min.css" rel="stylesheet"> <style> .padpad { padding: 40px 0; } .site-footer { border-top:1px solid #eee; margin-top:60px; text-align: center; } </style> <script src="dist/sbd.min.js"></script> <script src="https://cdnjs.cloudflare.com/ajax/libs/angular.js/1.3.15/angular.min.js"></script> <script type="text/javascript"> var app = angular.module('sbdApp', []); app.controller('Index', function($scope) { $scope.textContent = "On Jan. 20, former Sen. Barack Obama became the 44th President of the U.S. Millions attended the Inauguration."; }); app.filter('tokenize', function() { return function(input) { return tokenizer.sentences(input); }; }); app.filter('pluralize', function() { return function(input) { var len = tokenizer.sentences(input).length || 0; if (len > 1) { return len + " sentences"; } else if (len === 1) { return "1 sentence"; } return "No sentences"; }; }); </script> </head> <body ng-app='sbdApp' ng-controller='Index'> <!--[if lt IE 8]> <p class="browserupgrade">You are using an <strong>outdated</strong> browser. Please <a href="http://browsehappy.com/">upgrade your browser</a> to improve your experience.</p> <![endif]--> <a href="https://github.com/Tessmore/sbd"> <img style="position: absolute; top: 0; right: 0; border: 0;" src="https://camo.githubusercontent.com/38ef81f8aca64bb9a64448d0d70f1308ef5341ab/68747470733a2f2f73332e616d617a6f6e6177732e636f6d2f6769746875622f726962626f6e732f666f726b6d655f72696768745f6461726b626c75655f3132313632312e706e67" alt="Fork me on GitHub" data-canonical-src="https://s3.amazonaws.com/github/ribbons/forkme_right_darkblue_121621.png"> </a> <div class="container"> <div class="row"> <div class="col-md-12"> <h1 class="page-header">Sentence Boundary Detection (SBD).</h1> <p class="lead">Split text into sentences with a `vanilla` rule based approach (i.e working ~95% of the time).</p> <ul> <li>Split a text based on period, question- and exclamation marks.</li> <li>Skips (most) abbreviations (Mr., Mrs., PhD.)</li> <li>Skips numbers/currency</li> <li>Skips urls, websites, email addresses, phone nr.</li> <li>Counts ellipsis and ?! as single punctuation</li> </ul> </div> </div> <div class="row padpad"> <div class="col-md-6"> <div class="form-group"> <label>Type your text here:</label> <textarea class="form-control" ng-model='textContent' rows='12' style='width:100%'></textarea> </div> </div> <div class="col"> <div class="form-group"> <label>Your results will appear here:</label> <div> <pre id="results">{{textContent | tokenize | json}}</pre> <div class="help-block">{{textContent | pluralize }}</div> </div> </div> </div> </div> <div class="padpad site-footer"> <span class="site-footer-owner"> <a href="https://github.com/Tessmore/sbd">Sbd</a> is maintained by <a href="https://github.com/Tessmore">Tessmore</a>. </span> <span class="site-footer-credits">This page was created by <a href="https://github.com/fpereira1">Filype Pereira (@fpereira1)</a></span> </div> </div> <script src="https://ajax.googleapis.com/ajax/libs/jquery/1.11.3/jquery.min.js"></script> </body> </html>