Mocking YAML imports in Vitest
Over in Streetmix world I’d been slowly convincing myself to convert hard-coded configuration data from JSON to YAML, because it’s an easier format to work with if you want to type the minimum number of keystrokes and move on with your life. Just let the computer do the conversion to JavaScript, ya know?
It turns out that switching wasn’t hard at all. Streetmix uses Parcel for bundling and Vitest for testing, and both ecosystems support YAML via plugins, which are pretty well-documented, so I won’t rehash those here.
This blog entry, then, is about a problem I ran into while trying to mock a YAML file in a test suite.
In many of our test suites, I use a dummy data file (aka a mock) in place of the original one. This is because some of the original files are very large or change with some regularity, which can generate large snapshots that break whenever those files are modified. I would prefer to keep the tests minimal and contained, working with placeholder data.
So in Vitest, one would mock an imported JSON file like so:
// my-module.js
import MY_DATA from './data.json'
// my-module.test.js
vi.mock('./data.json', () => ({
  default: require('./__mocks__/data.json')
}))
While data.json is real application data, __mocks__/data.json would be a simplified facsimile of it: just enough to make sure our code works as intended under test.
With the YAML plugins in place, replacing the data file is easy-peasy.
// my-module.js
import MY_DATA from './data.yaml'
But, unfortunately, doing the same in the test suite fails.
// my-module.test.js
vi.mock('./data.yaml', () => ({
  default: require('./__mocks__/data.yaml')
}))
// --> Vitest error
So… what now?
The incoming mock data was a raw YAML document, and not readable by JavaScript. After a little bit of floundering and experimentation, I started to recognize that the YAML plugin only intercepted the import syntax from ECMAScript modules, but not the CommonJS require() function, even if the latter was still supported by Vitest.
The solution? Replace require() with import():
vi.mock('./data.yaml', async () =>
  await import('./__mocks__/data.yaml')
)
There are a couple of important things to point out here.
The first thing is that I’m using a dynamic import, which is different from a top-level import. A dynamic import looks like a function and works like one too, meaning you can put it anywhere in the module, rather than use a special declaration at the top.
So why couldn’t we just use a top-level import to bring it in as a variable, and then return that variable from the mock’s factory? That’s because Vitest hoists the vi.mock() call to the top of the file, so the factory can’t access any variables defined outside of it, and that includes imported files. That means the import has to happen inside the factory function, so we use the dynamic import.
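To make the hoisting problem a bit more concrete, here’s a rough model of it in plain JavaScript. This is not Vitest’s actual transform, just a sketch under the assumption that the factory ends up running before the top-level binding it refers to has been initialized. The function and variable names are made up for illustration.

```javascript
// Hypothetical model of a hoisted mock (NOT Vitest's real implementation):
// the factory runs before the variable it closes over has been initialized.
function runTestFileAsIfHoisted () {
  // The factory refers to a binding that is declared further down.
  const factory = () => ({ default: mockData })

  let outcome
  try {
    factory() // runs "early", the way a hoisted mock would
    outcome = 'factory saw the data'
  } catch (err) {
    outcome = err.name // the binding is still in its temporal dead zone
  }

  // Stand-in for the top-level import, evaluated too late to help.
  const mockData = { segments: ['sidewalk'] }

  return outcome
}
```

Calling runTestFileAsIfHoisted() returns 'ReferenceError', because the factory reaches for mockData before it exists. Moving the load inside the factory sidesteps the ordering problem entirely.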
Secondly, import() is asynchronous (it returns a promise), so the factory function is prefaced with the async keyword, and we await the result of that import.
And that’s it. Now the YAML plugin takes over during import, and the test suite succeeds.
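One detail worth spelling out is why this factory doesn’t need the { default: ... } wrapper that the JSON version had. Here’s a small sketch that should run in Node without Vitest or a bundler. It assumes a data: URL can stand in for the YAML file after a plugin has turned it into a module; the URL and its contents are made-up placeholders.

```javascript
// Stand-in for './__mocks__/data.yaml' after a YAML plugin has transformed it.
// (Hypothetical contents; a data: URL is used so no bundler is needed.)
const mockModuleUrl =
  'data:text/javascript,' +
  encodeURIComponent('export default { segments: ["sidewalk", "bike-lane"] }')

// Same shape as the factory passed to vi.mock() in the YAML example.
const factory = async () => await import(mockModuleUrl)

// import() resolves to a module namespace object, so the data is already
// sitting under the `default` key, which is what `import MY_DATA from ...` reads.
const namespacePromise = factory().then((ns) => {
  console.log(Object.keys(ns))
  console.log(ns.default.segments)
  return ns
})
```

With require(), we got the bare data back and had to wrap it in an object with a default key ourselves. With import(), the module namespace object already has that shape, so the factory can return it as is.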
Anyway, I wrote all this for a reason
I don’t have a ton of programming-related blog posts (heck, I don’t really blog much at all, just look around), so I find it a little laughable that this, of all the things I have to deal with in code world, is the problem that merits one.
But, hang on, I can explain myself!
There are a lot of developers out there in this age of AI-powered everything who swear upon the holy name of Marc Andreessen that “AI” has dramatically ten-exxed their productivity and output. As a skeptic of early-adopter technology (many of them operate as financial instruments whose investors are paid out when boosters can artificially inflate their value) I have been observing AI at arm’s length, rather than embracing it wholesale.
I do find LLM chatbots potentially useful for the next generation of information discovery, in the way that search engines were in the late 1990s. And, like search engines, our current iteration of AI is only as useful as the quality of information that already exists out in the wild. (This is something many other tech writers have already noted; I’m hardly declaring anything novel here.) Initially, I didn’t find anything related in the plugin repository’s issues or discussions, so I went to GitHub Copilot to see if it might have scraped something somewhere else.
I’ll cut to the chase: it didn’t know. It had no idea! As far as this model was concerned, there just isn’t any documentation that matched my specific usage.
To its credit, Copilot tried to be helpful, as in “why don’t you just write out its contents inline as a JS object?” But that’s not what I wanted to do! (And the less said about “mock the entire plugin,” the better.) Now, I’ve been programming long enough to know that you shouldn’t just blindly accept any answer to a question. Thankfully, I also have a good enough problem-solving sense, so with a little bit of adjustment of the ol’ thinking cap, I was able to find a better solution.
A brief light in the dark forest
Platforms have an inherent incentive to wall off their gardens, keeping their users and any content they’ve made trapped inside. If a competitor operates a large language model trained on any data it can get ahold of, that incentive only deepens. And so we have a double-whammy of a situation: information increasingly produced only inside of silos which put up ever-more barriers to access.
If I had posed my problem in Vite’s Discord server, I might have actually received an answer in about the same time that it took for me to figure it out on my own. But Discord is another walled garden. It isn’t anonymously searchable from the wider web, the way Stack Overflow is. If you ask for technical support in Discord, the solution disappears into barely-accessible history.
And now, there’s an activist movement to poison the AI well by taking content off the Internet, or obfuscating it in a way to make it useless to those models. This actually makes a lot of sense to me as a resistance move. After all, people have been “creating” content, often unpaid, for decades, and corporations profit handsomely from it.
So all of this is a downward spiral into the Dark Forest theory of the Internet: where most of the real people are quiet and insular, and all of the chatter is bots.
It’s a shitty outcome. And that’s an Internet incapable of helping me solve a problem when I needed to import a YAML file into Vitest.
So here’s a blog post. It’s not on a platform, and not in a walled garden. I don’t really care about SEO, but I hope the search engines find it anyway. And I hope the AI models do too. Get it in there. Maybe someday someone else will ask a similar question, and next time, the bot will know.