# Testing
`canvas-kit` has 3 levels of testing: unit, functional and visual. Each area has special use-cases
even though tests at various levels may overlap. Here's a simplified explanation of what each level
is meant to test:
- **Unit**: This level is meant to test the input/output of units of functionality (e.g. utility
functions or input props of components that output DOM).
- **Functional**: This level is browser-based testing and is where UI behavior and accessibility
requirements are specified (e.g. Given a modal example, When clicking on the button, Then the
modal should be open). Functional tests are written against Storybook Stories.
- **Visual**: This level is where all styling is tested. Some props only have visual effects and it
is difficult to reason about at any other level. All Storybook stories can be candidates for
visual regression based on an opt-in parameter.
Tests are supposed to be a source of feedback and confidence. To achieve that, a test should be
clear about intent and failures. Developers tend to ignore passing tests and focus only on failing
tests. If the intent of the test cannot be determined, it is difficult to fix the test. If the
failure isn't immediately obvious, too much time is spent fixing the test. Canvas Kit adopts a BDD
(behavior-driven development) approach to testing. The most common way to think about behavior is
"given/when/then". These behaviors are a series of specifications and not necessarily "tests".
Testing in this repo is not just testing the correctness of our components, but running a series of
specifications. Specifications tend to be more granular to give context of what code is expected to
do. A "test" simply has to pass. A specification requires meaning.
**Bad**:
```tsx
test('SomeComponent should render correctly', async () => {
  const {getByTestId} = render(<SomeComponent data-testid="test" text="foo" />);
  const component = getByTestId('test');

  expect(component.textContent).toEqual('foo');
  expect(component.getAttribute('aria-label')).toEqual('foo');
});
```
This is a "test" and not a specification, but a test that asserts multiple outputs. If any of the
expectations fail, the CI will output something like
`Failed: SomeComponent should render correctly`. The message doesn't give any indication of why any
of the expectations would fail. Furthermore, the failures don't give much indication as to what is
failing. Without looking at the code, the failure message of the expectation will say something like
"Expected 2 to equal 1" or "Expected 'bar' to equal 'foo'". Often to find out why the test fails,
the source code of the test has to be consulted and the test has to be debugged. After that, there
is no context as to why the failure even matters. This test doesn't give much feedback or confidence on
failure and thus isn't a very useful test.
**Good**:
```tsx
describe('SomeComponent', () => {
  it('should render the "text" prop as the text', async () => {
    const {getByTestId} = render(<SomeComponent data-testid="test" text="foo" />);
    const component = getByTestId('test');

    expect(component).toHaveTextContent('foo');
  });

  it('should render the "text" prop as an aria-label for accessibility', async () => {
    const {getByTestId} = render(<SomeComponent data-testid="test" text="foo" />);
    const component = getByTestId('test');

    expect(component).toHaveAttribute('aria-label', 'foo');
  });
});
```
This series of specifications gives more context to the test. It is more code, but it is clearer what
the intent of each specification is. If one fails, it will be easier to diagnose what failed and why. The
assertions also make use of `jest-dom` which gives more context into _how_ it fails. Instead of a
generic assertion, the CLI will output something like
`Expected <div data-testid="test" aria-label="bar">bar</div> to have an "aria-label" of "foo" but received "bar"`.
**The specification should tell us which spec failed and the assertion gives insight into how.**
### Selectors
Accessible components tend to have `role`s and ARIA attributes to help with selection of specific
elements within a component. React Testing Library comes with some selection helpers based on these
attributes. CSS class names should not be used for selecting components. `data-testid` should be
used as a last resort and usually is reserved for examples or an application to help disambiguate
components. Hard-coded `data-testid` attributes should be avoided in component source code, but they
are fine in unit and functional tests provided the test supplies the `data-testid` to make the test
more understandable. Additional info can be found
[here](https://kentcdodds.com/blog/making-your-ui-tests-resilient-to-change).
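For example, a minimal sketch of role-based selection with React Testing Library (the `SomeDialogExample` component and its labels are hypothetical):

```tsx
import {render, screen} from '@testing-library/react';

it('should open the dialog when the target button is clicked', () => {
  // `SomeDialogExample` is a placeholder component for illustration
  render(<SomeDialogExample />);

  // preferred: role-based selection mirrors the accessibility tree
  screen.getByRole('button', {name: 'Delete Item'}).click();

  expect(screen.getByRole('dialog', {name: 'Delete Item'})).toBeVisible();
});
```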
## Unit tests
Canvas Kit uses [Jest](https://jestjs.io/),
[React Testing Library](https://testing-library.com/docs/react-testing-library/intro), and
[jest-dom](https://github.com/testing-library/jest-dom) which have a nice API for testing React
components. Unit tests should test inputs and outputs of utility functions or props and output of
components. For example, if documentation says that a component will pass extra props to a container
component, there should be a specification that verifies this claim. If a component's documentation
states that children will be rendered, a test should verify the children exist within the containing
component.
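For example, a sketch of a spec that verifies such a documented claim (`SomeComponent` is a placeholder, as in the examples below):

```tsx
import {render} from '@testing-library/react';

describe('SomeComponent', () => {
  it('should forward extra props to the container element', () => {
    const {getByTestId} = render(<SomeComponent data-testid="test" id="my-id" />);

    // the documented claim: unrecognized props land on the container element
    expect(getByTestId('test')).toHaveAttribute('id', 'my-id');
  });
});
```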
Good unit tests avoid too much implementation detail. For example, if a component states it will
render the children prop, the documentation and specifications shouldn't state where unless it is
specific to the functionality of the component. Tests here should define the API surface area that
will be guaranteed. The more implementation details, the more difficult the tests are to maintain.
**Bad**:
```tsx
const {container} = render(
  <SomeComponent>
    <div data-testid="test">Test</div>
  </SomeComponent>
);

expect(container.children[0].children[0].textContent).toEqual('Test');
```
This test encodes the DOM structure into it. Any changes to the source code will require a change to
the test making the test tightly coupled and fragile.
**Good**:
```tsx
const {getByTestId} = render(
  <SomeComponent data-testid="container">
    <div data-testid="test">Test</div>
  </SomeComponent>
);

const component = getByTestId('container');
const child = getByTestId('test');

expect(component).toContainElement(child);
```
This example is more flexible. The component can be refactored with DOM elements added or removed
and the test will still pass as long as the `div` is a child of the component somewhere (the
specification is met). Also, this example will have a more useful failure message, since
`.toContainElement` knows it is expecting one element to contain another rather than matching a
string.
### Snapshot tests
Canvas Kit does not contain DOM-based snapshot tests and uses [Visual Tests](#visual-tests) instead.
DOM snapshot failures are often difficult to parse. Humans tend to be better at noticing and
discerning visual changes than changes to a DOM structure.
If your project uses snapshot tests and Canvas Kit, chances are you'll run into issues with changing
ids and other ARIA attributes. Canvas Kit generates unique ids that are different every time the
page loads. This can be a problem with snapshot tests. To fix this, you'll need to add special code
to your test bootstrap file. For example:
```ts
import {setUniqueSeed, resetUniqueIdCount} from '@workday/canvas-kit-react/common';

beforeEach(() => {
  setUniqueSeed('a'); // set a static seed
  resetUniqueIdCount(); // reset the id counter before every test
});
```
This will ensure snapshot tests have stable ids for each snapshot. Ids can still change if you add an
additional component that uses id generation (subsequent ids will shift), but this prevents snapshot
tests without any changes from showing diffs.
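With that setup in place, snapshot tests can render Canvas Kit components without id churn. A minimal sketch (the component name is hypothetical):

```tsx
import {render} from '@testing-library/react';

it('should match the stored snapshot', () => {
  // with the static seed, generated ids are stable (a0, a1, ...) on every run
  const {container} = render(<MyCanvasKitExample />);

  expect(container).toMatchSnapshot();
});
```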
The Canvas Kit Styling package uses CSS Variables and multiple class names with unique hashes.
The following snapshot serializers handle styling. Setting unique seeds will not affect static style
hashes because the styles are created before any test code is run; these serializers mask the hashes
instead.
```ts
// Handle `css-{hash}` class names
const emotionClassnameRegex = /css-[a-zA-Z0-9]{1,7}/g;
expect.addSnapshotSerializer({
  // `match` is used instead of `regex.test` because a `g` regex keeps state in
  // `lastIndex` between calls, which can make `test` return false negatives
  test: val => typeof val === 'string' && val.match(emotionClassnameRegex) !== null,
  print: val => `"${(val as string).replaceAll(emotionClassnameRegex, 'css-xxxxx')}"`,
});

// Handle `m{hash}` modifier class names
const emotionModifierClassnameRegex = /^m[a-zA-Z0-9]{5,7}/g;
expect.addSnapshotSerializer({
  test: val => typeof val === 'string' && val.match(emotionModifierClassnameRegex) !== null,
  print: val => `"${(val as string).replaceAll(emotionModifierClassnameRegex, 'm-xxxxx')}"`,
});

// Handle `--myVariableName-{hash}` CSS variables used by Stencils
const cssVariableRegex = /^--([a-zA-Z-]+)-[a-zA-Z0-9]{1,7}/g;
expect.addSnapshotSerializer({
  test: val => typeof val === 'string' && val.match(cssVariableRegex) !== null,
  print: val => `"${(val as string).replaceAll(cssVariableRegex, '--$1-xxxxx')}"`,
});
```
## Functional tests
Canvas Kit uses [Cypress](https://cypress.io) for browser-based behavior testing (additional info:
[Why Cypress?](https://github.com/Workday/canvas-kit/tree/master/cypress/WHY_CYPRESS.md)).
Functional tests ensure the code meets functional specifications from a user's perspective. All
functional tests are written against a Storybook Story. Specifications can come from many different
stakeholders including product management, user research, and accessibility. A functional test
should read in the Given/When/Then format and fit in well as acceptance criteria.
Example of a usability specification: "Given the modal example, when a user clicks the 'Delete Item'
button, then the modal opens".
Example of an accessibility specification: "Given the modal example, when a user clicks the 'Delete
Item' button, then the focus should be transferred to the first focusable element in the dialog".
The BDD-style (behavior-driven development) syntax lends itself well to nesting the actions performed
on an example (also called a "story", borrowed from Storybook nomenclature). The wording used inside the
BDD blocks should read in plain English and be understandable by non-developers. The implementation
should be as readable as possible by non-developers for trust and verification. The intent is for the
wording of the BDD-style blocks to parse into an outline form and feed into documentation as a record
of which specifications are covered.
Example:
```ts
describe('Given the Default Modal is rendered', () => {
  beforeEach(() => {
    // render the default modal example
  });

  context('when the target button is clicked', () => {
    beforeEach(() => {
      // click the target button
    });

    it('should open the modal', () => {
      // expect modal to be open
    });
  });
});
```
The outline would look like:
- Given the Default Modal is rendered
  - when the target button is clicked
    - then it should open the modal
Tests should rely on ARIA attributes as much as possible for selecting and asserting. Test IDs
should be used only to disambiguate DOM elements.
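For example, a sketch of ARIA-first selection in Cypress (the `data-testid` value is hypothetical):

```ts
// select the modal by its role rather than a CSS class
cy.get('[role=dialog]').should('be.visible');

// assert against ARIA attributes rather than styling internals
cy.get('[role=dialog]').should('have.attr', 'aria-labelledby');

// reach for a test id only to disambiguate between similar elements
cy.get('[role=dialog][data-testid="DeleteModal"]').should('exist');
```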
Read [Intro to Cypress](https://docs.cypress.io/guides/core-concepts/introduction-to-cypress.html)
for an introduction to how Cypress works. The core concept of Cypress is enqueued command chains that
typically wrap jQuery collections. Cypress uses jQuery, Mocha, Chai, Sinon, Lodash and Moment. All
are exposed to the developer of the test.
Let's break that down. Cypress has a jQuery-like chainable API to select and interact with DOM
nodes. Cypress controls when and how often a command is run to ensure assertions either pass or time
out. This is most confusing to those who approach Cypress commands as Promises.
[Commands are not Promises](https://docs.cypress.io/guides/core-concepts/introduction-to-cypress.html#Commands-Are-Not-Promises).
For example, imagine the following Cypress code:
```ts
cy.get('body').contains('button', 'Delete Item').click();
cy.get('[data-testid="MyResult"]').should('contain', 'Success!');
```
The Cypress Command queue will show `['get', 'contains', 'click', 'get', 'should']`, but it will not
run these commands immediately. There is also no need to use `.then` to wait for any previous
condition to be met first. The Cypress run-time will do the following:
- Run `$('body')` every 50ms until `$('body')` returns a non-empty jQuery collection.
- Run a function that gets all `button` elements inside of the `body` element that have text content
that contains 'Delete Item' and that jQuery collection is not empty.
- Ensure the previous element is clickable (visible, not covered, has actual width/height), waiting
  50ms to detect animations and for them to complete, etc. Basically, make sure a user could actually
  click on it.
- Run `$('[data-testid="MyResult"]')` every 50ms until it returns a non-empty jQuery collection.
Usually clicking has side-effects like network requests to the server before the side-effect is
reflected in the DOM. Since Cypress implicitly retries, this command will _eventually_ return a
non-empty jQuery collection and the author doesn't need to explicitly do anything. Note: this
selector _could_ return a non-empty jQuery collection instantly, but it might not contain
"Success!" yet...
- Run internal `chai` + `chai-jquery` matchers (`expect($el).to.contain('Success!')`) until an error
is no longer thrown. `.should` will re-run the previous command _and_ the assertion until the
condition is met or a timeout occurs. This is the secret sauce to the stability of Cypress tests
compared to WebDriver. `.should` can also receive a function that will be rerun until no error is
thrown (see the sketch below). If no error is thrown on the first run, the function runs only once.
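A minimal sketch of the callback form, reusing the hypothetical `MyResult` test id from above:

```ts
cy.get('[data-testid="MyResult"]').should($el => {
  // this callback is retried until no assertion throws (or the command times out)
  expect($el).to.have.length(1);
  expect($el.text()).to.contain('Success!');
});
```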
Cypress is very easy to get started with, but there isn't enough documentation about how to scale a
complicated suite of Cypress tests.
### Component helpers
Component helpers should be written to aid test automation. They make test implementations more
readable and more maintainable. Component helpers can also be
exported by this repository for use in downstream testing (e.g. Cypress tests in applications that
use Canvas Kit). Helpers contain implementation details about how components are composed
internally. The goal is that if the component's implementation changes, the component helpers should
be the only code that needs to change to keep tests passing.
Component helpers come in 3 flavors (a sketch of each follows the list):
- **Starter**: This helper type starts a Cypress chain. The return type is always
`Cypress.Chainable<JQuery>` and usually contains a `cy.get` and the selector usually contains
something that is unique to that component type. For example, the Modal helper uses
`[role=dialog]` with an option to disambiguate with a `data-testid`.
- **Scoping**: Imagine a web application is a collection of boxes that determine how components of
an application are laid out. A scoping helper takes in a larger box and returns a smaller box,
further reducing the scope. Ideally these helpers contain no Cypress commands (which are
asynchronous), only jQuery calls (which are synchronous). Synchronous helper functions can be run
more than once at the discretion of the Cypress runtime. A good use-case for this type is to
replace CSS selectors.
- **Action**: Cypress comes with many baked-in action commands such as `click` and `type`. A custom
action is useful to give a higher level semantic meaning to a series of low-level commands. For
example, a Select component might compose a Popup and a virtualized menu. This action may include
opening the select, finding the popup element, seeing if the desired item is currently visible,
scrolling the virtualized menu until the desired item is visible and clicking on the item. This
code would be repeated for every instance and has implementation knowledge of the component. A
high-level semantic action might be called "select" that wraps up this implementation detail to
free up automation and development engineers to think about the test case and not about
implementation details.
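Here is a sketch of the three flavors for a hypothetical Modal (the selectors and helper names are illustrative, not Canvas Kit's published helpers):

```ts
// Starter: begins a Cypress chain for a component type
function getModal(testId?: string): Cypress.Chainable<JQuery> {
  return testId ? cy.get(`[role=dialog][data-testid="${testId}"]`) : cy.get('[role=dialog]');
}

// Scoping: synchronous jQuery only, so the Cypress runtime may safely rerun it
function getModalHeading($modal: JQuery): JQuery {
  return $modal.find('h2');
}

// Action: wraps low-level commands in a single semantic step
function closeModal(testId?: string): void {
  getModal(testId).contains('button', 'Close').click();
}

// usage in a test
closeModal('DeleteModal');
cy.get('[role=dialog]').should('not.exist');
```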
## Visual Tests
Canvas Kit uses [Chromatic](https://www.chromatic.com/) for visual tests, which are run on stories
that are opted in through a special parameter. All of the visual states should be represented in at
least one story. This way, all of the visual states of a component can be visually regression tested
without requiring the test to interact with the UI. These states should include loading states,
error states, etc. Stories created for visual regression tests should avoid dynamic solutions like
[knobs](https://github.com/storybookjs/storybook/tree/next/addons/knobs).
To make a story runnable in Chromatic for visual testing, add the following to the story's parameter
list:
```tsx
// CSF - All stories in file
export default withSnapshotsEnabled({
  // CSF details (title, component, etc.)
});

// or CSF - Specific story
export const MyVisualStory = () => {
  // story contents
};

withSnapshotsEnabled(MyVisualStory);
```
Not all visual states are application states (e.g. `focus`, `hover`, and `active` on a button
element). For these pseudo-selectors, the `StaticStates` wrapper can be used, where pseudo-selectors
are turned into CSS class names instead (e.g. `:focus` turns into `.focus`) and a class name can be
added to the component. Note: this turns off the pseudo-selectors entirely.
The following will render a button where the focus styling is always applied:
```tsx
export const States = () => (
  <StaticStates>
    <SecondaryButton className="focus">My Button</SecondaryButton>
  </StaticStates>
);
```
**Note:** This will only work correctly if the component passes on extra attributes to a container
that contains pseudo-selector code (most components do this). If a pseudo-selector is defined on an
element inside the container, this won't work.
`StaticStates` allows for visual styling, like focus rings, to be tested automatically.
Some components are harder to test statically because they animate or normally require user
interaction. For example, Toasts animate in and Tooltips require a user to mouse over the target
element. Components should be written in a way that allows these animations to be disabled.
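One common approach, sketched here as an assumption rather than a published Canvas Kit API, is to inject a style override into visual-test stories that freezes all transitions and animations so the component renders in its settled state:

```tsx
// hypothetical visual-test story that disables CSS animations and transitions
const disableAnimations = `
  *, *::before, *::after {
    animation: none !important;
    transition: none !important;
  }
`;

export const ToastVisual = () => (
  <>
    <style>{disableAnimations}</style>
    {/* the Toast renders in its final, settled state for the snapshot */}
    <Toast>Item deleted</Toast>
  </>
);
```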
## Summary
- Unit tests should cover APIs exposed to developers
- Functional tests should be written as a series of specifications of behavior and accessibility
- Visual tests should cover all visually representable states of components