Apify

We're making the web more programmable.

Pinned

  1. crawlee-python Public

    Crawlee—A web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Wo…

    Python · 3.3k stars · 223 forks

  2. crawlee Public

    Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, an…

    TypeScript · 13.8k stars · 586 forks

  3. proxy-chain Public

    Node.js implementation of a proxy server (think Squid) with support for SSL, authentication, and upstream proxy chaining.

    JavaScript · 808 stars · 137 forks

  4. apify-sdk-js Public

    Apify SDK monorepo

    TypeScript · 112 stars · 30 forks

  5. got-scraping Public

    HTTP client made for scraping, based on got.

    TypeScript · 432 stars · 33 forks

  6. fingerprint-suite Public

    Browser fingerprinting tools for anonymizing your scrapers. Developed by Apify.

    TypeScript · 833 stars · 88 forks

Repositories

Showing 10 of 124 repositories
  • crawlee-python Public

    Crawlee—A web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with BeautifulSoup, Playwright, and raw HTTP. Both headful and headless mode. With proxy rotation.

    Python · 3,315 stars · Apache-2.0 · 223 forks · 53 issues · 6 pull requests · Updated Jul 26, 2024
  • apify-cli Public

    The Apify command-line interface (CLI) helps you create, develop, build, and run Apify Actors, and manage the Apify cloud platform.

    TypeScript · 118 stars · 17 forks · 27 issues (1 issue needs help) · 7 pull requests · Updated Jul 26, 2024
  • apify-haystack Public

    The official integration for Apify and Haystack 2.0

    Python · 0 stars · Apache-2.0 · 0 forks · 0 issues · 0 pull requests · Updated Jul 26, 2024
  • apify-docs Public

    This project is the home of Apify's documentation.

    API Blueprint · 24 stars · Apache-2.0 · 70 forks · 69 issues · 23 pull requests · Updated Jul 26, 2024
  • crawlee Public

    Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Puppeteer, Playwright, Cheerio, JSDOM, and raw HTTP. Both headful and headless mode. With proxy rotation.

    TypeScript · 13,770 stars · Apache-2.0 · 586 forks · 105 issues (1 issue needs help) · 11 pull requests · Updated Jul 26, 2024
  • apify-client-js Public

    Apify API client for JavaScript / Node.js.

    JavaScript · 63 stars · Apache-2.0 · 24 forks · 16 issues · 9 pull requests · Updated Jul 26, 2024
  • apify-client-python Public

    Apify API client for Python.

    Python · 42 stars · Apache-2.0 · 10 forks · 11 issues · 8 pull requests · Updated Jul 25, 2024
  • apify-eslint-config-ts Public

    TypeScript ESLint configuration shared across Apify projects.

    JavaScript · 1 star · Apache-2.0 · 0 forks · 0 issues · 1 pull request · Updated Jul 25, 2024
  • apify-sdk-python Public

    The Apify SDK for Python is the official library for creating Apify Actors in Python. It provides useful features like actor lifecycle management, local storage emulation, and actor event handling.

    Python · 113 stars · Apache-2.0 · 9 forks · 19 issues · 6 pull requests · Updated Jul 25, 2024
  • openapi Public

    An OpenAPI specification for the Apify API.

    JavaScript · 1 star · MIT · 0 forks · 14 issues · 1 pull request · Updated Jul 25, 2024