# Netscape Bookmark Parser
A TypeScript/JavaScript library for parsing browser bookmark files (HTML format) and manipulating them as structured data. Compatible with both Deno and Node.js runtimes.
> **Note:**
> This README is AI-generated. Details are mostly accurate, but some descriptions may be inaccurate.
> Japanese documentation: [`./README-ja.md`](./README-ja.md)
## Features
- **HTML Bookmark File Parsing**: Parse HTML bookmark files exported from Chrome, Firefox, Safari, and other browsers
- **Hierarchical Structure Preservation**: Preserve the full hierarchy of folders and bookmarks
- **Bidirectional Conversion**: Convert HTML to a data structure and back again
- **JSON Serialization**: Save and restore bookmark trees as JSON
- **Deno & Node.js Support**: Works with both Deno and Node.js runtimes
- **Type Safety**: Full TypeScript support with comprehensive type definitions
## Installation
### Node.js/npm
```bash
npm install netscape-bookmark-parser
```
```typescript
import { BookmarksParser, BookmarksTree } from "netscape-bookmark-parser";
```
### Deno
```typescript
import {
  BookmarksParser,
  BookmarksTree,
} from "jsr:@grakeice/netscape-bookmark-parser";
```
> **Note:** The JSR package includes only the Node.js/Deno build. For browser support, please use the npm package.
### Browser
#### Option 1: Using Build Tools (Recommended)
**Webpack, Vite, Rollup, Parcel, etc.:**
```typescript
// For browser environments, use the web-optimized version
import { BookmarksParser, BookmarksTree } from "netscape-bookmark-parser/web";
// Example: Parse uploaded bookmark file
function handleFileUpload(event: Event) {
  const file = (event.target as HTMLInputElement).files?.[0];
  if (file) {
    const reader = new FileReader();
    reader.onload = (e) => {
      const htmlContent = e.target?.result as string;
      const bookmarksTree = BookmarksParser.parse(htmlContent);
      console.log(bookmarksTree.toJSON());
    };
    reader.readAsText(file);
  }
}
```
#### Option 2: CDN with Import Maps
```html
<script type="importmap">
  {
    "imports": {
      "netscape-bookmark-parser/web": "https://cdn.jsdelivr.net/npm/netscape-bookmark-parser@1.1.4/esm/mod_web.js"
    }
  }
</script>
<script type="module">
  import { BookmarksParser, BookmarksTree } from "netscape-bookmark-parser/web";

  // htmlContent is assumed to hold bookmark HTML text obtained elsewhere,
  // for example from a file input as in Option 1.
  const tree = BookmarksParser.parse(htmlContent);
</script>
```
#### Option 3: Direct CDN Import
```html
<script type="module">
  import {
    BookmarksParser,
    BookmarksTree,
  } from "https://cdn.jsdelivr.net/npm/netscape-bookmark-parser@1.1.4/esm/mod_web.js";
</script>
```
> **Browser Support:** Browser compatibility is only available through the npm package. The JSR package does not include the web-optimized version due to platform-specific dependencies.
> **Note:** The web-optimized version uses native browser APIs (DOMParser, etc.) and does not include Node.js polyfills, making it lighter and faster in browser environments.
## Usage
### Basic Example
```typescript
import { BookmarksParser, BookmarksTree } from "netscape-bookmark-parser";
// Read HTML bookmark file
const htmlContent = `<!DOCTYPE NETSCAPE-Bookmark-file-1>
<HTML>
<BODY>
<DL><p>
    <DT><H3>Folder 1</H3>
    <DL><p>
        <DT><A HREF="https://example.com">Example</A>
    </DL><p>
    <DT><A HREF="https://google.com">Google</A>
</DL><p>
</BODY>
</HTML>`;
// Parse HTML and convert to BookmarksTree
const bookmarksTree = BookmarksParser.parse(htmlContent);
// Output as JSON
console.log(JSON.stringify(bookmarksTree.toJSON(), null, 2));
// Convert back to an HTML document (DOM)
const htmlDocument = bookmarksTree.toDOM();
// Or get the HTML as a string (HTMLText is a deprecated alias of HTMLString)
console.log(bookmarksTree.HTMLString);
```
### BookmarksTree Operations
```typescript
// Create a new bookmark tree
const tree = new BookmarksTree();
// Add bookmarks
tree.set("Google", "https://google.com");
tree.set("GitHub", "https://github.com");
// Create a folder structure
const devFolder = new BookmarksTree();
devFolder.set("MDN", "https://developer.mozilla.org");
devFolder.set("Stack Overflow", "https://stackoverflow.com");
const toolsFolder = new BookmarksTree();
toolsFolder.set("GitHub", "https://github.com");
toolsFolder.set("VS Code", "https://code.visualstudio.com");
devFolder.set("Tools", toolsFolder);
tree.set("Development", devFolder);
// Convert to JSON
const json = tree.toJSON();
// Restore from JSON
const restoredTree = BookmarksTree.fromJSON(json);
// Check tree structure
console.log(tree.size); // Number of top-level items
console.log(tree.has("Development")); // true
console.log(tree.get("Development") instanceof BookmarksTree); // true
```
### Working with Complex Structures
```typescript
// Parse a complex bookmark file
const complexHtml = `<!DOCTYPE NETSCAPE-Bookmark-file-1>
<HTML>
<BODY>
<DL><p>
    <DT><H3>Work</H3>
    <DL><p>
        <DT><H3>Development</H3>
        <DL><p>
            <DT><A HREF="https://github.com">GitHub</A>
            <DT><A HREF="https://stackoverflow.com">Stack Overflow</A>
        </DL><p>
        <DT><A HREF="https://docs.google.com">Google Docs</A>
    </DL><p>
    <DT><H3>Personal</H3>
    <DL><p>
        <DT><A HREF="https://youtube.com">YouTube</A>
        <DT><A HREF="https://twitter.com">Twitter</A>
    </DL><p>
    <DT><A HREF="https://google.com">Google</A>
</DL><p>
</BODY>
</HTML>`;
const tree = BookmarksParser.parse(complexHtml);
// Navigate the tree structure
const workFolder = tree.get("Work") as BookmarksTree;
const devFolder = workFolder.get("Development") as BookmarksTree;
console.log(devFolder.get("GitHub")); // "https://github.com"
```
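The `as BookmarksTree` casts above assume every step of the path is a folder. As a hedged alternative, the sketch below walks a path and returns `undefined` when a step is missing or is not a folder. It is self-contained and uses plain `Map` objects as stand-ins; `BookmarkNode` and `getByPath` are hypothetical names, not part of the library's API. It relies only on the documented fact that `BookmarksTree` extends `Map`, and whether a real `BookmarksTree` is assignable to this exact type is an assumption.

```typescript
// Hypothetical structural type: a URL string or a nested Map (folder).
type BookmarkNode = string | Map<string, BookmarkNode>;

// Follow a list of names from the root; stop early if a step is not a folder.
function getByPath(
  root: Map<string, BookmarkNode>,
  path: string[],
): BookmarkNode | undefined {
  let current: BookmarkNode | undefined = root;
  for (const name of path) {
    if (!(current instanceof Map)) return undefined;
    current = current.get(name);
  }
  return current;
}

// Plain Maps standing in for a parsed tree.
const sampleDev = new Map<string, BookmarkNode>([["GitHub", "https://github.com"]]);
const sampleWork = new Map<string, BookmarkNode>([["Development", sampleDev]]);
const sampleRoot = new Map<string, BookmarkNode>([["Work", sampleWork]]);

console.log(getByPath(sampleRoot, ["Work", "Development", "GitHub"])); // "https://github.com"
console.log(getByPath(sampleRoot, ["Work", "Missing", "GitHub"])); // undefined
```

The helper trades the compile-time cast for a runtime check, which may be preferable when the bookmark file comes from user input.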
## API Reference
### Web-Optimized Version
> **Important:** Browser support is only available via npm installation. The JSR version does not include browser-compatible builds.
The library provides a browser-optimized version that eliminates Node.js dependencies and uses native browser APIs:
```typescript
// Import browser-optimized version (npm only)
import { BookmarksParser, BookmarksTree } from "netscape-bookmark-parser/web";
```
### BookmarksParser Class
`BookmarksParser` provides static methods to parse Netscape Bookmark format HTML or JSON and convert them into `BookmarksTree` instances.
#### Static Methods
- [`static parse(htmlString: string): BookmarksTree`](#static-parsehtmlstring-string-bookmarkstree)
- Parses a Netscape Bookmark format HTML string and returns a `BookmarksTree`.
- Alias for [`parseFromHTMLString`](#static-parsefromhtmlstringhtmlstring-string-bookmarkstree).
**Example:**
```typescript
const html = `<!DOCTYPE NETSCAPE-Bookmark-file-1>\n<HTML><BODY><DL><p>\n <DT><A HREF="https://example.com">Example</A>\n</DL><p></BODY></HTML>`;
const tree = BookmarksParser.parse(html);
console.log(tree.toJSON());
```
- [`static parseFromHTMLString(htmlString: string): BookmarksTree`](#static-parsefromhtmlstringhtmlstring-string-bookmarkstree)
- Parses a Netscape Bookmark format HTML string and returns a `BookmarksTree`.
**Example:**
```typescript
const html = `<!DOCTYPE NETSCAPE-Bookmark-file-1>...`;
const tree = BookmarksParser.parseFromHTMLString(html);
```
- [`static parseFromDOM(dom: HTMLDocument): BookmarksTree`](#static-parsefromdomdom-htmldocument-bookmarkstree)
- Converts an existing `HTMLDocument` to a `BookmarksTree`.
**Example:**
```typescript
const dom = new DOMParser().parseFromString(html, "text/html");
const tree = BookmarksParser.parseFromDOM(dom);
```
- [`static parseFromJSON(jsonString: string): BookmarksTree`](#static-parsefromjsonjsonstring-string-bookmarkstree)
- Parses a JSON string and returns a `BookmarksTree`.
**Example:**
```typescript
const json = '{"Google": "https://google.com"}';
const tree = BookmarksParser.parseFromJSON(json);
```
---
### BookmarksTree Class
`BookmarksTree` extends `Map` and manages folders (`BookmarksTree`) and bookmarks (URL strings) in a hierarchical structure.
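Because the tree is a `Map` whose values are either URL strings or nested trees, ordinary recursion is enough to traverse it. The following is a minimal sketch of that idea using plain `Map` objects as stand-ins. `BookmarkNode`, `FlatBookmark`, and `flattenBookmarks` are hypothetical names introduced for illustration and are not part of the library.

```typescript
// Hypothetical structural type: a URL string or a nested Map (folder).
type BookmarkNode = string | Map<string, BookmarkNode>;

interface FlatBookmark {
  path: string[]; // folder names followed by the bookmark name
  url: string;
}

// Depth-first walk that collects every bookmark with its full path.
function flattenBookmarks(
  tree: Map<string, BookmarkNode>,
  prefix: string[] = [],
): FlatBookmark[] {
  const result: FlatBookmark[] = [];
  tree.forEach((value, name) => {
    if (typeof value === "string") {
      result.push({ path: [...prefix, name], url: value });
    } else {
      result.push(...flattenBookmarks(value, [...prefix, name]));
    }
  });
  return result;
}

// Plain Maps standing in for a BookmarksTree.
const flatDev = new Map<string, BookmarkNode>([["MDN", "https://developer.mozilla.org"]]);
const flatRoot = new Map<string, BookmarkNode>([
  ["Google", "https://google.com"],
  ["Development", flatDev],
]);

console.log(flattenBookmarks(flatRoot).map((b) => b.path.join(" / ")));
// [ "Google", "Development / MDN" ]
```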
#### Constructor
- [`new BookmarksTree()`](#constructor)
**Example:**
```typescript
const tree = new BookmarksTree();
tree.set("Google", "https://google.com");
```
#### Instance Methods
- [`toJSON(): Record<string, unknown>`](#tojson-recordstring-unknown)
- Converts the tree to a JSON object representation.
**Example:**
```typescript
const json = tree.toJSON();
console.log(json);
```
- [`toDOM(): HTMLDocument`](#todom-htmldocument)
- Converts the tree to a Netscape Bookmark format HTML document.
**Example:**
```typescript
const dom = tree.toDOM();
```
- [`get HTMLString(): string`](#get-htmlstring-string)
- Gets the complete HTML string in Netscape Bookmark format.
**Example:**
```typescript
const html = tree.HTMLString;
console.log(html);
```
- [`get HTMLText(): string`](#get-htmltext-string)
- Alias for `HTMLString` (deprecated).
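To illustrate what producing Netscape Bookmark HTML from a nested structure involves, here is a rough, self-contained sketch. It is not the library's implementation. `escapeHtml` and `renderList` are hypothetical helpers, the tree is a plain `Map` stand-in, and the output covers only the `<DL>` list, not the doctype or header. The escaping step mirrors the behavior the changelog describes for bookmark titles and URLs.

```typescript
// Hypothetical structural type: a URL string or a nested Map (folder).
type BookmarkNode = string | Map<string, BookmarkNode>;

// Escape the characters that are unsafe in HTML text and attribute values.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Render a folder as a <DL> list; nested folders become <H3> plus a nested <DL>.
function renderList(tree: Map<string, BookmarkNode>, indent = ""): string {
  const lines: string[] = [`${indent}<DL><p>`];
  tree.forEach((value, name) => {
    if (typeof value === "string") {
      lines.push(`${indent}    <DT><A HREF="${escapeHtml(value)}">${escapeHtml(name)}</A>`);
    } else {
      lines.push(`${indent}    <DT><H3>${escapeHtml(name)}</H3>`);
      lines.push(renderList(value, indent + "    "));
    }
  });
  lines.push(`${indent}</DL><p>`);
  return lines.join("\n");
}

const renderSample = new Map<string, BookmarkNode>([
  ["Q & A", "https://example.com/?a=1&b=2"],
  ["Docs", new Map<string, BookmarkNode>([["MDN", "https://developer.mozilla.org"]])],
]);

console.log(renderList(renderSample));
```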
#### Static Methods
- [`static fromJSON(json: Record<string, unknown>): BookmarksTree`](#static-fromjsonjson-recordstring-unknown-bookmarkstree)
- Creates a tree from a JSON object.
**Example:**
```typescript
const json = { Google: "https://google.com" };
const tree = BookmarksTree.fromJSON(json);
```
- [`static fromDOM(dom: HTMLDocument): BookmarksTree`](#static-fromdomdom-htmldocument-bookmarkstree)
- Creates a tree from an HTML document.
**Example:**
```typescript
const dom = new DOMParser().parseFromString(html, "text/html");
const tree = BookmarksTree.fromDOM(dom);
```
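The JSON examples in this README suggest a shape in which string values are bookmark URLs and object values are folders. Under that assumption, the conversion in both directions can be sketched with two small recursive functions. `BookmarkJson`, `treeToJson`, and `jsonToTree` are hypothetical names operating on plain `Map` objects; they illustrate the idea and make no claim about how `toJSON` and `fromJSON` are implemented.

```typescript
// Hypothetical structural types for the tree and its assumed JSON shape.
type BookmarkNode = string | Map<string, BookmarkNode>;
interface BookmarkJson {
  [name: string]: string | BookmarkJson;
}

// Map -> plain object: folders become nested objects.
function treeToJson(tree: Map<string, BookmarkNode>): BookmarkJson {
  const json: BookmarkJson = {};
  tree.forEach((value, name) => {
    json[name] = typeof value === "string" ? value : treeToJson(value);
  });
  return json;
}

// Plain object -> Map: nested objects become nested Maps.
function jsonToTree(json: BookmarkJson): Map<string, BookmarkNode> {
  const tree = new Map<string, BookmarkNode>();
  for (const name of Object.keys(json)) {
    const value = json[name];
    if (value === undefined) continue; // guard for strict index-access settings
    tree.set(name, typeof value === "string" ? value : jsonToTree(value));
  }
  return tree;
}

const jsonSample: BookmarkJson = {
  Google: "https://google.com",
  Development: { MDN: "https://developer.mozilla.org" },
};

const roundTripped = treeToJson(jsonToTree(jsonSample));
console.log(JSON.stringify(roundTripped) === JSON.stringify(jsonSample)); // true
```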
## Project Structure
```
src/
├── BookmarksTree/
│   ├── BookmarksTree.ts          # Main bookmark tree class
│   ├── BookmarksTree.test.ts     # Comprehensive tests
│   └── index.ts                  # Export definitions
└── BookmarksParser/
    ├── BookmarksParser.ts        # HTML parser
    ├── BookmarksParser.test.ts   # Parser tests
    └── index.ts                  # Export definitions
scripts/
└── build_npm.ts                  # NPM build script
.github/
└── workflows/
    └── release.yml               # CI/CD pipeline
npm/                              # Node.js build artifacts
├── esm/                          # ES modules
├── package.json
└── README.md
```
## Supported Formats
### Input Formats
- **Netscape Bookmark File Format**: Standard HTML bookmark file format
- **Chrome Export Format**: HTML files exported from Chrome browser
- **Firefox Export Format**: HTML files exported from Firefox browser
- **Safari Export Format**: HTML files exported from Safari browser
- **Edge Export Format**: HTML files exported from Microsoft Edge
- **Generic HTML**: Any HTML following the Netscape bookmark structure
### Output Formats
- **JSON**: Structured JSON data for easy programmatic access
- **HTML**: Standard Netscape Bookmark File Format compatible with all browsers
- **DOM**: Browser-compatible HTMLDocument objects
### Special Features
- **Unicode Support**: Full support for international characters and emojis
- **URL Validation**: Handles various URL formats (http, https, ftp, file, relative)
- **HTML Entity Handling**: Proper escaping and unescaping of HTML entities
- **Empty Folder Support**: Preserves empty folders in the bookmark structure
- **Duplicate Handling**: Last value wins for duplicate bookmark names
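
The "last value wins" rule matches standard `Map` semantics, where setting an existing key overwrites its value. The sketch below demonstrates this with a plain `Map` and adds a hypothetical `setUnique` helper (not part of the library) for callers who would rather keep both entries by renaming the later one.

```typescript
// Standard Map behavior: a second set() with the same key replaces the value.
const dupFolder = new Map<string, string>();
dupFolder.set("Google", "https://google.com");
dupFolder.set("Google", "https://www.google.com/maps");
console.log(dupFolder.size); // 1
console.log(dupFolder.get("Google")); // "https://www.google.com/maps"

// Hypothetical helper: append " (2)", " (3)", ... until the name is unused.
function setUnique(target: Map<string, string>, name: string, url: string): string {
  let candidate = name;
  let counter = 2;
  while (target.has(candidate)) {
    candidate = `${name} (${counter})`;
    counter += 1;
  }
  target.set(candidate, url);
  return candidate; // the name actually used
}

console.log(setUnique(dupFolder, "Google", "https://google.com")); // "Google (2)"
```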
## Dependencies
- [`@b-fuze/deno-dom`](https://jsr.io/@b-fuze/deno-dom): DOM parser and manipulation for both Deno and Node.js environments
## Browser Compatibility
### Supported Browsers for Export Files
- **Chrome/Chromium**: All versions
- **Firefox**: All versions
- **Safari**: All versions
- **Microsoft Edge**: All versions
- **Opera**: All versions
- **Internet Explorer**: 6+ (legacy support)
### Exported File Examples
The library can parse bookmark files exported in formats such as the following:
```html
<!-- Chrome/Edge format -->
<!DOCTYPE NETSCAPE-Bookmark-file-1>
<!--This is an automatically generated file.-->
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=UTF-8">
<TITLE>Bookmarks</TITLE>
<H1>Bookmarks</H1>
<DL><p>
    <DT><H3 ADD_DATE="1640995200">Bookmarks bar</H3>
    <DL><p>
        <DT><A HREF="https://example.com" ADD_DATE="1640995200">Example</A>
    </DL><p>
</DL>

<!-- Firefox format -->
<!DOCTYPE NETSCAPE-Bookmark-file-1>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=UTF-8">
<TITLE>Bookmarks</TITLE>
<H1>Bookmarks Menu</H1>
<DL><p>
    <DT><H3>Bookmarks Toolbar</H3>
    <DL><p>
        <DT><A HREF="https://example.com">Example Site</A>
    </DL><p>
</DL>
```
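The `ADD_DATE` attribute in the Chrome/Edge example is a Unix timestamp in seconds. This README describes bookmarks as name-to-URL entries and does not document access to such attributes, so the sketch below is independent of the library: it reads the value straight from a line of exported HTML with a regular expression. `addDateToDate` is a hypothetical helper.

```typescript
// Convert an ADD_DATE value (assumed to be Unix seconds) to a Date.
function addDateToDate(addDate: string): Date | undefined {
  if (addDate.trim() === "") return undefined;
  const seconds = Number(addDate);
  if (!Number.isFinite(seconds)) return undefined;
  return new Date(seconds * 1000); // Date expects milliseconds
}

const exportedLine =
  '<DT><A HREF="https://example.com" ADD_DATE="1640995200">Example</A>';
const addDateMatch = /ADD_DATE="(\d+)"/.exec(exportedLine);
const addedAt = addDateToDate(addDateMatch?.[1] ?? "");

console.log(addedAt?.toISOString()); // "2022-01-01T00:00:00.000Z"
```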
## License
MIT License - See [LICENSE](LICENSE) file for details.
## Author
**grakeice**
- GitHub: [@grakeice](https://github.com/grakeice)
## Changelog
### v1.1.4 (Latest)
- 🛠️ **Improved parser** (better compatibility with various bookmark HTML formats, increased stability)
- 🧹 **Code refactoring and additional tests**
- 🐞 **Minor bug fixes**
- 📚 **Documentation Update**: API Reference refresh, unified explanations, added usage examples, revived various sections
### v1.1.3
- 📝 **Added JSDoc comments**: Added JSDoc-style comments to major classes and methods to improve type information and enable automatic API documentation generation
- 📚 **Documentation Update**: Added TypeScript import example for Node.js/npm in the Installation section
### v1.1.2
- 📝 **Documentation Enhancement**: Updated comprehensive README documentation with latest version references and improved examples
- 🔧 **Version Consistency**: Synchronized version numbers across all documentation and code examples
- 📚 **Content Updates**: Refined installation instructions, usage examples, and API documentation for better clarity
### v1.1.1
- 🛡️ **Security Enhancement**: [`BookmarksTree.prototype.HTMLText`](src/BookmarksTree/BookmarksTree.ts) now properly escapes HTML entities in bookmark titles and URLs
- 🔧 **Code Consistency**: Unified HTML escaping behavior between Node.js and browser versions
- 📝 **Documentation Updates**: Enhanced API documentation with security considerations
- 🐛 **Bug Fixes**: Minor stability improvements and edge case handling
### v1.1.0
- 🌐 **Browser Support**: Added web-optimized version for browser environments
- 📦 **Dual Entry Points**: Separate builds for Node.js/Deno (`./mod.ts`) and browsers (`./mod_web.ts`)
- ⚡ **Native DOM APIs**: Browser version uses native DOMParser and DOM APIs for better performance
- 🔧 **Build Optimization**: Enhanced build process with polyfill removal for browser compatibility
- 📚 **Updated Documentation**: Added browser usage examples and API reference
### v1.0.1
- ✨ **Core Features**: Complete HTML bookmark file parsing functionality
- 🏗️ **BookmarksTree Class**: Hierarchical structure management with Map interface
- 🔄 **Bidirectional Conversion**: JSON ↔ HTML ↔ DOM conversion support
- 🧪 **Comprehensive Testing**: Full test coverage with edge case handling
- 📦 **Multi-Runtime Support**: Native Deno and Node.js compatibility
- 🔧 **TypeScript Support**: Complete type definitions and IntelliSense
- 🤖 **CI/CD Pipeline**: Automated testing and publishing via GitHub Actions
- 📚 **Documentation**: Comprehensive README with examples and API reference
- 🌐 **International Support**: Unicode and multi-language bookmark handling
- ⚡ **Performance Optimized**: Efficient parsing for large bookmark collections
- 🛡️ **Error Handling**: Graceful handling of malformed HTML and invalid URLs
### v0.0.1-pre4
- Early pre-release version
- Core functionality proof of concept
- Basic parsing implementation
- Initial testing setup