Naukri automation tool to fetch, filter, and apply for jobs automatically using generative AI.

# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

**Naukri Ninja** is a Node.js CLI automation tool that streamlines job searching and applications on Naukri.com. It combines web scraping, AI-powered job matching, and automated job applications with quota management to help job seekers find and apply for suitable positions efficiently.

## Development Commands

```bash
# Run the application
npm run dev

# Run in debug mode with inspector and verbose logging
npm run debug

# Package as Windows executable (generates ../naukri-ninja.exe)
npm run pkg
```

## Architecture

### Core Components

The application is organized into these primary layers:

1. **Entry Point** (`index.js`)
   - Main CLI loop managing the job search workflow
   - Coordinates profile selection, job fetching, suitability matching, and applications
   - Handles quota management and daily limits
   - Integrates analytics tracking throughout the process

2. **API Layer** (`api.js`)
   - All Naukri.com HTTP requests are wrapped here
   - Key endpoints:
     - `searchJobsAPI()` - Fetch job listings by keyword/location
     - `getJobDetailsAPI()` - Get detailed job information
     - `applyJobsAPI()` - Submit job applications
     - `loginAPI()` - User authentication
     - `getProfileDetailsAPI()` - Fetch user profile data
     - `matchScoreAPI()` - Get server-side job match scores
   - Uses custom headers with authentication tokens for Naukri requests

3. **AI/Matching Layer** (`gemini.js`, `utils/geminiUtils.js`)
   - Integrates Google's Gemini API for intelligent job matching
   - Supports both Vertex AI (service account) and Generative AI (API key) authentication
   - Main functions:
     - `checkSuitability()` - Evaluates if a job matches user's profile using embeddings
     - `answerQuestion()` - Generates contextual answers to application questionnaires
   - Uses AI prompts from `utils/genAiPrompts.js`
   - Question embeddings stored in memory and cached for efficiency

4. **User/Profile Management** (`utils/userUtils.js`)
   - Profile construction from Naukri API responses
   - User preferences for job search filters and matching strategy
   - Profile compression for efficient storage

5. **Job Processing** (`utils/jobUtils.js`)
   - `findNewJobs()` - Search and collect job IDs from Naukri
   - `applyForJobs()` - Submit applications in batches
   - `getJobInfo()` - Fetch detailed info for jobs (batch processing)
   - `handleQuestionnaire()` - Process and answer application questions

6. **Vector Search & Embeddings** (`vectorSearch.js`, `utils/embeddings.js`, `utils/embeddingUtils.js`)
   - Document chunking and embedding generation via Gemini
   - Cosine similarity-based search for finding relevant job context
   - Reusable embeddings cached in JSON files for performance
   - Powers contextual answering for job questionnaires

7. **I/O and Utilities**
   - `utils/ioUtils.js` - File operations, readline interface, JSON serialization
   - `utils/prompts.js` - Interactive CLI menus and user input
   - `utils/spinniesUtils.js` - Spinner/loading indicators
   - `utils/helper.js` - In-memory localStorage implementation, date formatting
   - `utils/cmdUtils.js` - System command execution (open URLs, folders)
   - `utils/analyticsUtils.js` - Event tracking and analytics
   - `utils/emailTemplate.js` - HTML email template for HR outreach

8. **Update System**
   - `utils/updater.js` - Version checking and update orchestration
   - `updateZip.js` - Downloads and extracts update zip from GitHub releases
   - `update-helper.js` / `update-helper.cjs` - Applies file replacements during self-update (spawned as a separate process so the main exe can be overwritten)
   - `updateFunctionality/downloadLatestExeFromGitHub.js` - Alternate exe-based update download

### Data Flow

```
User starts app
    ↓
Select/manage profile
    ↓
Configure preferences (pages, daily quota, matching strategy)
    ↓
Search for jobs (searchJobsAPI)
    ↓
For each job:
    ├─ Check if already applied
    ├─ Evaluate suitability (AI matching via Gemini)
    ├─ If suitable & not applied:
    │   ├─ Fetch job details
    │   ├─ Get application questions
    │   ├─ Generate contextual answers (embeddings + AI)
    │   └─ Submit application (applyJobsAPI)
    └─ Track quota and results
    ↓
Write results to CSV
```

## Key Patterns

### Authentication & State Management

- Uses Naukri auth tokens (Bearer + cookies) stored in custom headers
- Session state via `localStorage` (in-memory object in `utils/helper.js`)
- User profile and preferences persisted in JSON files

### Job Matching Strategy

- Multiple strategies: `matchingStrategy()` in `utils/utils.js` uses either:
  - **Server-side**: Naukri's match score API
  - **AI-based**: Gemini embeddings comparing job description to user skills
  - **Manual**: User reviews each job before applying

### Batch Processing

- Jobs processed one at a time with detailed per-job output
- API calls use batching (e.g., `getJobInfo()` processes jobs in configurable batch sizes)
- Rate limiting via spinners and sequential processing

### AI Integration

- Generative AI prompts in `utils/genAiPrompts.js` for job analysis
- Embeddings cached in JSON files to avoid regenerating vectors
- Context window optimization by searching relevant doc chunks before sending to Gemini

### Configuration

- User preferences stored as JSON (genAiConfig, matching strategy, daily quota)
- Supports multiple profiles with different preferences
- API keys and service account files stored in `/apikeys` folder (git-ignored)

## Development Notes

### Adding New Job Matching Logic

- Matching strategies defined in `utils/utils.js` - add to `matchingStrategy()`
- Prompts for AI evaluation in `utils/genAiPrompts.js`
- Suitability checking in `utils/geminiUtils.js:checkSuitability()`

### Extending Application Answering

- Question answering logic in `utils/geminiUtils.js:answerQuestion()`
- Uses cached question embeddings for retrieval
- Embedding utilities in `utils/embeddingUtils.js`

### API Changes

- All Naukri endpoints in `api.js`
- Headers are built dynamically with authentication in `getHeaders()`
- Add new endpoints following existing pattern with proper auth headers

### Testing Considerations

- No test suite currently (see `package.json` - test script echoes error)
- Manual testing via `npm run dev` or `npm run debug`
- Debug output controlled by `--inspect` flag in process.execArgv

### CI/CD Pipeline

- GitHub Actions workflow in `.github/workflows/publish-and-package.yml`
- On push to main: bumps version, publishes to npm, packages exe, creates GitHub release
- Version bumping based on commit message keywords (feat=minor, fix=patch, major=breaking)

## Important Implementation Details

- **No persistent database**: All state is in-memory or JSON files
- **Naukri headers critical**: API calls fail without proper headers (`appid`, `clientid`, `gid`, etc.) from `api.js`
- **AI API keys**: Stored locally in code or `/apikeys` folder - never commit
- **Daily quota**: Managed by Naukri server - local quota tracking for user reference
- **Error handling**: Selective retry logic for network; specific error codes (409001, 401, 403) handled
- **Spinner management**: Start/stop spinners to show progress - check `utils/spinniesUtils.js` for API

## Dependencies

- **@google-cloud/vertexai** - Google Vertex AI for embeddings/generation (service account auth)
- **@google/generative-ai** - Google Gemini API (API key auth)
- **@inquirer/prompts** - Interactive CLI prompts
- **csv-writer** - Write job results to CSV
- **nodemailer** - Email sending capability
- **unzipper** - Handle zip files for updates
- **node-fetch** - HTTP requests
- **spinnies** - CLI loading spinners
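
The cosine similarity-based search mentioned under Vector Search & Embeddings can be illustrated with a minimal sketch. The function names `cosineSimilarity` and `topKChunks`, the `{ text, embedding }` chunk shape, and the toy 3-dimensional vectors are all assumptions made for illustration; the repository's actual code in `vectorSearch.js` and `utils/embeddingUtils.js` may be structured differently.

```javascript
// Illustrative sketch only: names and data shapes are hypothetical,
// not taken from vectorSearch.js or utils/embeddingUtils.js.

// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // Treat a zero vector as having no similarity to anything.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the k chunks most similar to the query embedding, best first.
function topKChunks(queryEmbedding, chunks, k = 3) {
  return chunks
    .map((chunk) => ({
      text: chunk.text,
      score: cosineSimilarity(queryEmbedding, chunk.embedding),
    }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

// Toy vectors stand in for real Gemini embeddings loaded from a JSON cache.
const cachedChunks = [
  { text: "5 years of Node.js experience", embedding: [1, 0, 0] },
  { text: "Notice period: 30 days", embedding: [0, 1, 0] },
  { text: "Built REST APIs with Express", embedding: [0.9, 0.1, 0] },
];

const best = topKChunks([1, 0, 0], cachedChunks, 2);
console.log(best.map((c) => c.text));
```

Only the top-scoring chunks would then be passed to the model as context, which is one way to keep prompts small when answering questionnaire questions.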
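
The batching pattern noted under Batch Processing (for example, `getJobInfo()` working through jobs in configurable batch sizes) might look roughly like the sketch below. `chunk`, `processInBatches`, and `fakeFetchJobInfo` are hypothetical names, and running the items of a batch concurrently while the batches themselves run sequentially is an assumption, not a description of `utils/jobUtils.js`.

```javascript
// Illustrative sketch only: helper names are hypothetical and the
// concurrency model is an assumption, not taken from utils/jobUtils.js.

// Split an array into consecutive slices of at most `size` items.
function chunk(items, size) {
  if (!Number.isInteger(size) || size < 1) {
    throw new Error("Batch size must be a positive integer");
  }
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Run `worker` over all items: batches run one after another,
// items inside a batch run concurrently. Results keep input order.
async function processInBatches(items, batchSize, worker) {
  const results = [];
  for (const batch of chunk(items, batchSize)) {
    const batchResults = await Promise.all(batch.map((item) => worker(item)));
    results.push(...batchResults);
  }
  return results;
}

// Hypothetical stand-in for a per-job detail request.
async function fakeFetchJobInfo(jobId) {
  return { jobId, title: `Job ${jobId}` };
}

processInBatches(["a1", "b2", "c3", "d4", "e5"], 2, fakeFetchJobInfo).then(
  (jobs) => console.log(jobs.length, "jobs fetched")
);
```

Waiting for each batch to finish before starting the next caps the number of requests in flight at the batch size, which is a simple way to stay gentle on a remote API.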
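
The in-memory `localStorage` described under Authentication & State Management could be approximated as follows. This is a sketch under assumptions: `createMemoryStorage` and the `authToken` key are hypothetical, and the real object in `utils/helper.js` may expose a different surface. The method names mirror the browser Web Storage API, where values are stored as strings and missing keys return `null`.

```javascript
// Illustrative sketch only: not the implementation in utils/helper.js.

// Minimal in-memory stand-in for the browser's localStorage.
// Keys and values are coerced to strings; data is lost when the process exits.
function createMemoryStorage() {
  const store = new Map();
  return {
    getItem(key) {
      const k = String(key);
      return store.has(k) ? store.get(k) : null;
    },
    setItem(key, value) {
      store.set(String(key), String(value));
    },
    removeItem(key) {
      store.delete(String(key));
    },
    clear() {
      store.clear();
    },
    get length() {
      return store.size;
    },
  };
}

const storage = createMemoryStorage();
storage.setItem("authToken", "example-token"); // hypothetical key and value
console.log(storage.getItem("authToken"));
console.log(storage.getItem("missing"));
```

Because nothing here touches disk, anything that must survive a restart (profiles, preferences, cached embeddings) still needs to be written to JSON files separately.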