gatsby-source-filesystem

A Gatsby plugin for sourcing data into your Gatsby application from your local filesystem.

The plugin creates File nodes from files. Transformer plugins can then convert File nodes into other types of data. For example, gatsby-transformer-json transforms JSON files into JSON nodes, and gatsby-transformer-remark transforms markdown files into MarkdownRemark nodes.

Install

npm install gatsby-source-filesystem

How to use

You can have multiple instances of this plugin in your gatsby-config to read files from different locations on your filesystem. Be sure to give each instance a unique name.

gatsby-config.js
module.exports = {
  plugins: [
    {
      resolve: `gatsby-source-filesystem`,
      options: {
        // The unique name for each instance
        name: `pages`,
        // Path to the directory
        path: `${__dirname}/src/pages/`,
      },
    },
    {
      resolve: `gatsby-source-filesystem`,
      options: {
        name: `data`,
        path: `${__dirname}/src/data/`,
        // Ignore files starting with a dot
        ignore: [`**/\.*`],
        // Use "mtime" and "inode" to fingerprint files (to check if file has changed)
        fastHash: true,
      },
    },
  ],
}

In the above example, every file under src/pages and src/data will be made available as a File node inside GraphQL. You don’t need to set up another instance of gatsby-source-filesystem for a subdirectory such as src/data/images, since those files are already sourced. However, if you want to be able to filter your files, you can set up a new instance and later filter on its sourceInstanceName.

Options

name

Required

A unique name for the gatsby-source-filesystem instance. This name is also stored on each File node under the key sourceInstanceName, which you can use for filtering.

path

Required

Path to the folder that should be sourced. Ideally an absolute path.

ignore

Optional

Array of file globs to ignore. They will be added to the following default list:

**/*.un~
**/.DS_Store
**/.gitignore
**/.npmignore
**/.babelrc
**/yarn.lock
**/node_modules
../**/dist/**
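
A minimal sketch of how the final ignore list is put together, assuming your globs are simply appended to the defaults above. The helper name buildIgnoreList and the glob `**/*.draft.md` are hypothetical and only serve the illustration; they are not part of the plugin's API.

```javascript
// Default globs, copied from the list above
const DEFAULT_IGNORE = [
  `**/*.un~`,
  `**/.DS_Store`,
  `**/.gitignore`,
  `**/.npmignore`,
  `**/.babelrc`,
  `**/yarn.lock`,
  `**/node_modules`,
  `../**/dist/**`,
]

// Hypothetical helper: user globs extend the defaults, they never replace them
function buildIgnoreList(userIgnore = []) {
  return [...DEFAULT_IGNORE, ...userIgnore]
}

// The `ignore` option from gatsby-config would be passed in here
const ignoreList = buildIgnoreList([`**/*.draft.md`])
```

The practical consequence is that you only need to list the patterns specific to your project; the defaults stay in effect.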

fastHash

Optional

By default, gatsby-source-filesystem creates an MD5 hash of each file to determine if it has changed between sourcing. However, on sites with many large files this can lead to a significant slowdown. Thus you can enable the fastHash setting to use an alternative hashing mechanism.

fastHash fingerprints files using their mtime and inode. On a modern OS this is a robust way to detect whether a file has changed, but on older systems it can be unreliable, so it is not enabled by default.
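
The difference between the two strategies can be sketched with Node's built-in modules. This is an illustration of the idea only, not the plugin's actual implementation; the function names contentFingerprint and metadataFingerprint are hypothetical.

```javascript
const crypto = require("node:crypto")
const fs = require("node:fs")

// Content-based fingerprint: has to read the whole file, so the cost grows with file size
function contentFingerprint(filePath) {
  return crypto
    .createHash("md5")
    .update(fs.readFileSync(filePath))
    .digest("hex")
}

// Metadata-based fingerprint: a single stat call, independent of file size
function metadataFingerprint(filePath) {
  const { mtimeMs, ino } = fs.statSync(filePath)
  return `${mtimeMs}-${ino}`
}
```

The metadata variant never opens the file, which is why it helps on sites with many large files, and also why it depends on the filesystem reporting modification times and inodes reliably.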

Environment variables

  • GATSBY_CONCURRENT_DOWNLOAD (default: 200). To limit the number of concurrent requests, you can configure the concurrency of processRemoteNode.

If you have a spotty network or slow connection, you can adjust the retries and timeouts:

  • GATSBY_STALL_RETRY_LIMIT (default: 3)
  • GATSBY_STALL_TIMEOUT (default: 30000)
  • GATSBY_CONNECTION_TIMEOUT (default: 30000)

How to query

You can query the File nodes as follows:

{
  allFile {
    nodes {
      extension
      dir
      modifiedTime
    }
  }
}

Use GraphiQL to explore all available keys.

To filter by the name you specified in the gatsby-config, use sourceInstanceName:

{
  allFile(filter: { sourceInstanceName: { eq: "data" } }) {
    nodes {
      extension
      dir
      modifiedTime
    }
  }
}

Helper functions

gatsby-source-filesystem exports three helper functions:

createFilePath

When building pages from files, you often want to create a URL from a file’s path on the filesystem. For example, if you have a markdown file at src/content/2018-01-23-my-blog-post/index.md, you might want to turn that into a page on your site at example.com/blog/2018-01-23-my-blog-post/. createFilePath is a helper function to make this task easier.

createFilePath({
  // The node you'd like to convert to a path
  // e.g. from a markdown, JSON, YAML file, etc.
  node,
  // Method used to get a node
  // The parameter from `onCreateNode` should be passed in here
  getNode,
  // The base path for your files.
  // It is relative to the `options.path` setting in the `gatsby-source-filesystem` entries of your `gatsby-config`.
  // Defaults to `src/pages`. For the example above, you'd use `src/content`.
  basePath,
  // Whether you want your file paths to contain a trailing `/` slash
  // Defaults to true
  trailingSlash,
})

Example

gatsby-node.js
const { createFilePath } = require(`gatsby-source-filesystem`)

exports.onCreateNode = ({ node, getNode, actions }) => {
  const { createNodeField } = actions
  // Ensures we are processing only markdown files
  if (node.internal.type === "MarkdownRemark") {
    // Use `createFilePath` to turn markdown files in our `src/content` directory into `/blog/slug`
    const relativeFilePath = createFilePath({
      node,
      getNode,
      basePath: "src/content",
    })

    // Creates a new queryable field named 'slug'
    createNodeField({
      node,
      name: "slug",
      value: `/blog${relativeFilePath}`,
    })
  }
}
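
To make the resulting slug concrete, here is a simplified stand-in for the path logic, assuming the file path has already been made relative to the sourced directory. sketchFilePath is a hypothetical function written for this illustration; it is not the plugin's implementation and it skips the node lookup that createFilePath performs.

```javascript
const path = require("node:path")

// Hypothetical stand-in: turns a path relative to the sourced directory into a URL path.
// A file named `index` contributes nothing to the URL, so only its directory remains.
function sketchFilePath(relativePath, trailingSlash = true) {
  const { dir, name } = path.posix.parse(relativePath)
  const parsedName = name === "index" ? "" : name
  return path.posix.join("/", dir, parsedName, trailingSlash ? "/" : "")
}

// Mirrors the `value` passed to createNodeField in the example above
const slug = `/blog${sketchFilePath("2018-01-23-my-blog-post/index.md")}`
// slug is "/blog/2018-01-23-my-blog-post/"
```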

createRemoteFileNode

When building source plugins for remote data sources (headless CMSs, APIs, etc.), the data will often link to files stored remotely. It is usually convenient to download those files so you can work with them locally.

The createRemoteFileNode helper makes it easy to download remote files and add them to your site’s GraphQL schema.

While downloading the assets, special characters (regex: /:|\/|\*|\?|"|<|>|\||\\/g) in filenames are replaced with a hyphen "-". When special characters are found, a file hash is added to keep files unique, e.g. a:file.jpg becomes a-file-73hd.jpg (otherwise a:file.jpg and a*file.jpg would overwrite each other).
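
The replacement step can be sketched with the regex quoted above. replaceSpecialChars is a hypothetical helper for illustration; the hash suffix that the plugin appends is deliberately left out, which makes the collision it protects against visible.

```javascript
// Regex copied from the paragraph above
const SPECIAL_CHARS = /:|\/|\*|\?|"|<|>|\||\\/g

// Hypothetical helper: replaces every special character with a hyphen
function replaceSpecialChars(fileName) {
  return fileName.replace(SPECIAL_CHARS, "-")
}

const first = replaceSpecialChars("a:file.jpg") // "a-file.jpg"
const second = replaceSpecialChars("a*file.jpg") // "a-file.jpg"

// Without an added hash, both downloads would end up with the same name
const wouldCollide = first === second // true
```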

createRemoteFileNode({
  // The source url of the remote file
  url: `https://example.com/a-file.jpg`,
  // The id of the parent node (i.e. the node to which the new remote File node will be linked)
  parentNodeId,
  // Gatsby's cache which the helper uses to check if the file has been downloaded already. It's passed to all Node APIs.
  getCache,
  // The action used to create nodes
  createNode,
  // A helper function for creating node Ids
  createNodeId,
  // OPTIONAL
  // Adds htaccess authentication to the download request if passed in.
  auth: { htaccess_user: `USER`, htaccess_pass: `PASSWORD` },
  // OPTIONAL
  // Adds extra http headers to download request if passed in.
  httpHeaders: { Authorization: `Bearer someAccessToken` },
  // OPTIONAL
  // Sets the file extension
  ext: `.jpg`,
})

Example

The following example is pulled from the Preprocessing External Images guide. Downloaded files are created as File nodes and then linked to the MarkdownRemark node, so they can be used with e.g. gatsby-plugin-image. The File node can then be queried using GraphQL.

gatsby-node.js
const { createRemoteFileNode } = require("gatsby-source-filesystem")

exports.onCreateNode = async ({
  node,
  actions: { createNode, createNodeField },
  createNodeId,
  getCache,
}) => {
  // For all MarkdownRemark nodes that have a featured image url, call createRemoteFileNode
  if (
    node.internal.type === "MarkdownRemark" &&
    // A truthy check also skips nodes where the field is missing (undefined), not only null
    node.frontmatter.featuredImgUrl
  ) {
    const fileNode = await createRemoteFileNode({
      url: node.frontmatter.featuredImgUrl, // string that points to the URL of the image
      parentNodeId: node.id, // id of the parent node of the fileNode you are going to create
      createNode, // helper function in gatsby-node to generate the node
      createNodeId, // helper function in gatsby-node to generate the node id
      getCache,
    })

    // if the file was created, extend the node with "localFile"
    if (fileNode) {
      createNodeField({ node, name: "localFile", value: fileNode.id })
    }
  }
}

Retrieving the remote file name and extension

The helper first tries to retrieve the file name and extension by parsing the url and the path provided (e.g. if the url is https://example.com/image.jpg, the extension will be inferred as .jpg and the name as image). If the url does not contain an extension, createRemoteFileNode uses the file-type package to infer the file type. Finally, the name and the extension can be passed explicitly, like so:

createRemoteFileNode({
  // The source url of the remote file
  url: `https://example.com/a-file-without-an-extension`,
  parentNodeId: node.id,
  getCache,
  createNode,
  createNodeId,
  // if necessary!
  ext: `.jpg`,
  name: `image`,
})
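
The first step, inferring the name and extension from the url, can be approximated with Node's built-in URL and path modules. inferNameAndExt is a hypothetical helper for illustration and does not reproduce the plugin's code or the file-type fallback.

```javascript
const path = require("node:path")

// Hypothetical helper: derives name and extension from the URL path only
function inferNameAndExt(url) {
  const { name, ext } = path.posix.parse(new URL(url).pathname)
  return { name, ext }
}

const withExt = inferNameAndExt("https://example.com/image.jpg")
// { name: "image", ext: ".jpg" }

const withoutExt = inferNameAndExt(
  "https://example.com/a-file-without-an-extension"
)
// { name: "a-file-without-an-extension", ext: "" }
```

An empty ext is the case where passing ext explicitly, as in the call above, removes any guesswork.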

createFileNodeFromBuffer

When working with data that isn’t already stored in a file, such as when querying binary/blob fields from a database, it’s helpful to cache that data to the filesystem in order to use it with other transformers that accept files as input.

The createFileNodeFromBuffer helper accepts a Buffer, caches its contents to disk, and creates a File node that points to it.

The name of the file can be passed to the createFileNodeFromBuffer helper. If no name is given, the content hash will be used to determine the name.

Example

The following example is adapted from the source of gatsby-source-mysql:

gatsby-node.js
const createMySqlNodes = require(`./create-nodes`)

exports.sourceNodes = async ({ actions, createNodeId, getCache }, config) => {
  const { createNode } = actions
  const { conn, queries } = config
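  // `query` is assumed to be a helper (not shown here) that runs the configured queries
  const { db, results } = await query(conn, queries)

  try {
    queries
      // Attach each result set under `__sql`, the key that createMySqlNodes destructures
      .map((query, i) => ({ ...query, __sql: results[i] }))
      .forEach(result =>
        createMySqlNodes(result, results, {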
          createNode,
          createNodeId,
          getCache,
        })
      )
    db.end()
  } catch (e) {
    console.error(e)
    db.end()
  }
}

// create-nodes.js
const { createFileNodeFromBuffer } = require(`gatsby-source-filesystem`)
const createNodeHelpers = require(`gatsby-node-helpers`).default

const { createNodeFactory } = createNodeHelpers({ typePrefix: `mysql` })

function attach(node, key, value, ctx) {
  if (Buffer.isBuffer(value)) {
    ctx.linkChildren.push(parentNodeId =>
      createFileNodeFromBuffer({
        buffer: value,
        getCache: ctx.getCache,
        createNode: ctx.createNode,
        createNodeId: ctx.createNodeId,
        // Link the new File node to the node created for this row
        parentNodeId,
      })
    )
    value = `Buffer`
  }

  node[key] = value
}

function createMySqlNodes({ name, __sql, idField, keys }, results, ctx) {
  const MySqlNode = createNodeFactory(name)
  ctx.linkChildren = []

  return __sql.forEach(row => {
    if (!keys) keys = Object.keys(row)

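    // Collect the row's values in a plain object first
    const fields = { id: row[idField] }

    for (const key of keys) {
      attach(fields, key, row[key], ctx)
    }

    // Build the Gatsby node with the factory (assumed to return the finished node), then register it
    const node = MySqlNode(fields)
    ctx.createNode(node)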

    for (const link of ctx.linkChildren) {
      link(node.id)
    }
  })
}

module.exports = createMySqlNodes