Paul Mirocha UX Design

A government agency published an environmental review when this LNG terminal was built. There’s a section about impacts on local fishermen.

 
 
 
 
 
 

How would you find a copy?

 

Finding the Needle: Simplifying Search for Massive, Complex, and Unorganized Document Collections

Title 

NEPAccess.org

Organization

The Udall Center for Studies in Public Policy based at the University of Arizona, funded by the National Science Foundation.

Summary

We used design and data science to collect and organize millions of pages of complex, unstructured government technical documents into a searchable database, usable by six persona types, from professionals to citizens.

2020-2023

Roles

  • As the sole UX researcher and UI designer, I was responsible for shaping the overall design direction of this project.

Team

  • I worked with a developer and an interdisciplinary team of data scientists, machine-learning researchers, environmental and social scientists, public policy experts, lawyers, and students.

Tools

  • Moderated user testing interviews with Zoom.us. Users set up realistic scenarios for a usability walkthrough. I recorded and logged notes using Airtable.
  • Pencil & paper to sketch ideas
  • Balsamiq and Draw.io for diagramming and wireframes
  • Figma for mockups and prototypes, as well as WordPress and Photoshop.

Technology

React.js, SQL, Lucene search engine, WordPress

Problem definition

In this project, understanding the problem space–the NEPA law and how it relates to social/environmental policy decisions–was critical to working with stakeholders, understanding specific user needs, and designing a user interface that addresses these understandings.

Background. In the 1960s, a series of high-profile environmental and social crises made headlines that shocked the nation, among them Atlanta’s I-20 freeway construction, which destroyed urban communities, the Santa Barbara oil spill, and the Cuyahoga River fire. These events catalyzed the passing of the National Environmental Policy Act (NEPA), approved by Congress with near-unanimous support and signed into law by President Nixon on January 1, 1970.

The law. The core mandate of NEPA is to balance environmental protection with infrastructure development needs by studying environmental impacts and inviting public comment when planning infrastructure projects.

What the law did. Since its inception in 1970, NEPA has continuously generated volumes of environmental impact statements (EISs), providing detailed scientific analyses of social and environmental impacts from projects. Nobody really knows how many of these reports were published.

What it didn’t do. Despite the production of over 40,000 EISs comprising millions of pages, no centralized means to store and access this crucial information was established when the law took effect in 1970. As a result, 50+ years of published science has been difficult or impossible to find.

Why it’s important. Large planning projects often face “wicked problems,” the most complex and challenging issues of our time. These are problems like climate change, poverty, and pollution that can’t be definitively solved because of multiple causes, conflicting viewpoints, and high uncertainty.

Why we built this. Having access to knowledge generated by NEPA is crucial for informing complex policy decisions.

Enter NEPAccess. It’s a collaborative effort involving environmental and social scientists, legal and policy experts, computer scientists, students, developers, and designers at the University of Arizona. This initiative employs modern data science and AI tools to collect, categorize, and centralize these documents into a searchable platform.

Visualization of the problem of unstructured data for an explainer video
Even within a single project, the data was unorganized

Phase 1: Discovery

Stakeholder interviews

This was a specialized database with a complex history. It was essential to understand the problem from the perspective of the project leadership. These are top quotes:

Vision

“NEPA is an important environmental law and we know almost nothing about how it works in practice.”

“The other problem is that there’s a lot of valuable data contained within the documents that’s completely inaccessible.”

Pain points

“Just finding documents. Just the fact that they’re so huge…It can be really challenging to navigate them and get to the information that you need”

“NEPA users may not even know that they have those pain points, because people can’t really imagine it being any different.”

History of the problem

“In the past, you just didn’t know if you’re going to be able to find enough data to do your research. You didn’t know where to go to look for it. And it’s really just like throwing darts at a dartboard.”

The challenge

“Right now the biggest challenge is figuring out how people will want to use it at this point in time. Once we have that, then the biggest challenge will be: can we do what people want?”

How to define success?

“All these multiple user types should be able to sit down–and come away feeling that they’ve either got what they wanted to know, or with a bit more probing they will get what they wanted. I think that’s the big challenge”

Environmental planning issues are often “wicked problems”: those with high complexity and uncertainty, multiple causes, and conflicting viewpoints, which makes them essentially unsolvable. NEPA was supposed to help with this.

How people find NEPA documents now

To answer this, I ran a search myself, using the scenario of a citizen who wants to read the science behind a controversial mine whose name they heard in the local news.

My hypothesis was that designing for the common citizen would help professional users as well. 

I started with a Google search. I was optimistic. How hard could this be? Google found an agency website with a search form for EISs.

PAUL: types Rosemont mine into the search box 

SYSTEM: No records met the search criteria.

How can that be? I know the document exists and NEPA requires it to be made public.

I found another link to a library site that had a large collection of EISs. I got the same (lack of) results. Did I do something wrong? I felt confused and anxious.
Surely a high-profile mining project like this would at least return a catalog record!

I wrote to the reference librarian listed in the library site sidebar. She responded, explaining that the document was on a CD-ROM, which I could request through interlibrary loan. She also gave me the full title to see if that would help my search.

PAUL: pastes title into search box: “Final environmental impact statement for the Rosemont copper project: a proposed mining operation, Coronado National Forest Pima County, Arizona.” 

SYSTEM: No records met the search criteria.

PAUL: tries two words from the title: Rosemont mining


SYSTEM: No records met the search criteria.

PAUL: tries Rosemont copper.

Success! But, why?

The words in Rosemont mining and in Rosemont copper all occur within the title, but only Rosemont copper occurs as two adjacent words. Apparently, search terms had to form a consecutive phrase within the title, like “Rosemont copper.”
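The difference between the two kinds of matching can be sketched in a few lines of Python. This is only an illustration of the behavior the search appeared to show. The agency site's real matching logic is unknown, the function names are hypothetical, and the sketch does not explain why pasting the full title also returned nothing.

```python
import re

# The full title supplied by the reference librarian.
TITLE = (
    "Final environmental impact statement for the Rosemont copper project: "
    "a proposed mining operation, Coronado National Forest Pima County, Arizona"
)


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def phrase_match(query: str, title: str) -> bool:
    """True only if the query words appear in the title as one consecutive run."""
    q, t = tokenize(query), tokenize(title)
    n = len(q)
    return n > 0 and any(t[i:i + n] == q for i in range(len(t) - n + 1))


def all_terms_match(query: str, title: str) -> bool:
    """True if every query word appears somewhere in the title, in any order."""
    q = tokenize(query)
    return bool(q) and set(q) <= set(tokenize(title))


print(phrase_match("Rosemont copper", TITLE))     # True: adjacent words in the title
print(phrase_match("Rosemont mining", TITLE))     # False: both words present, but not adjacent
print(all_terms_match("Rosemont mining", TITLE))  # True: word order and distance are ignored
print(all_terms_match("Rosemont mine", TITLE))    # False: the title says "mining", not "mine"
```

A citizen who types Rosemont mine fails under both rules, which suggests why matching on full document text, and not only on titles, matters for this audience.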

 I downloaded the document I found.

Phase 2: Uncovering user needs and goals

User testing

We didn’t know exactly how people would want to use this tool. I conducted usability walkthrough interviews concurrently with stages of UI development. 

Search is difficult to design. A search user interface (UI) must be simple. It has to translate computer logic into elements that are either intuitive to humans or easily learned using plain language. Because people are so used to Google searches and take that speed and invisible power for granted, we needed a high success rate. If people could not find relevant search results, or got no results at all, the site would lose credibility.

Research Questions

Context

What are the users’ jobs to be done?

Analyze workflow

How do they currently do this? 

Find opportunities

What could be better about how they currently do this?

How will people use it?

What advanced search features will people use?

Will people search for a single document or to see related project documents together?

 

Computer skills

What is a persona’s level of domain knowledge (NEPA)? Does this correlate with computer skills?

First screen

As a sole designer on a team of data scientists, I had to start with the technology they were already building and work towards the user experience.

I received the developer’s “skeleton screen” as a UX starting point. First, the main task was to replace “system language” with labels that humans understood, or else remove them.

The team cheered at the first demo: entering the search term Rosemont in the search box returned both draft and final reports for the Rosemont mine. It’s a start.

We used this testing server for the first user testing interviews.

 

Starting point: The developer's first screen. Note successful search for "Rosemont."

Question #1: Will people want advanced search?

It’s well documented that most users are confused by advanced search, yet team members wondered if users would need it for such specialized content. 

 

The test page showing the user all the information that the database could show.

Cognitive load. Search is a mentally intensive task. It is difficult for a user to think about their own questions and, at the same time, how the database works. I began to steer the team towards a minimalist search interface.

For testing, I created a “progressive disclosure” prototype in Figma to show/hide the advanced search.

First test 

When we developed full-text search, developers wondered if people would want the option of title-only search vs. full document text search.

The looks on people’s faces below express what we learned from this user test. Combining too many search options with the search box is confusing and creates friction.

Six kinds of search for multiple words = too many options!

Annotations on screen above: 1 “AND” Boolean search; 2 “OR” Boolean search; 3 exact phrase search; 4 full document text search; 5 excludes certain words; 6 searches various metadata.

 

People have a simplified mental model from habitual use of Google, where there’s just one kind of search. “I just want to find what I’m looking for.”

Combined search

We learned that people didn’t need to distinguish between “title search” and “full text search.” That distinction was the developers’ thinking. We combined the two kinds of searches into one search box.
We also learned that people were using Google for NEPA searches, and considered it a frustrating workaround because Google was not designed for this kind of specialized domain.

Default search? “That’s useless,” said the lawyer on the interview screen. “And I can’t tell if this single document I can see is useful to me.”

The table format made it hard for this user to find related documents, or to find results for the complex topic search he entered.

Question #2: Will people search for single documents or will they want to see related documents together?

 Often NEPA documents change titles between versions. Developers created a “title similarity tool” on the details page to collect related files based on title similarity. 

 

People want to do both–they are both valid use cases.

The doomed title similarity slider appeared on the details page for a document

Well, that didn’t work.

People played with the slider, but no one understood what the tool did. The developer wrote the copy: “A lower match percentage yields more results.”
What?
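What the slider copy was trying to say can be shown with a small sketch. The developers' actual similarity algorithm is not described here, so this example uses Python's standard `difflib.SequenceMatcher` as a stand-in, and the document titles are invented for illustration.

```python
from difflib import SequenceMatcher


def title_similarity(a: str, b: str) -> float:
    """Return a similarity score between 0.0 and 1.0 for two titles."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def related_titles(target: str, candidates: list[str], min_match_percent: float) -> list[str]:
    """Keep candidates whose similarity to the target meets the slider value."""
    threshold = min_match_percent / 100
    return [c for c in candidates if title_similarity(target, c) >= threshold]


# Hypothetical titles, not real records from the database.
target = "Example Mine Project Draft Environmental Impact Statement"
candidates = [
    "Example Mine Project Final Environmental Impact Statement",
    "Final EIS for the Example Mine Project",
    "Sample Highway Widening Record of Decision",
]

strict = related_titles(target, candidates, min_match_percent=90)
loose = related_titles(target, candidates, min_match_percent=30)

# Everything that passes the strict threshold also passes the loose one,
# so lowering the percentage can only add results, never remove them.
print(len(strict), len(loose))
```

Seen this way, “a lower match percentage yields more results” is accurate, but it describes the system's threshold and not the user's goal, which is simply to see the other documents from the same project.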

Let’s try cards, without the table

The table format was the developer’s default format. It resembled the spreadsheets the scientists on the team were used to, and it was easy for the system to output. But a table would not work for grouping documents together. I introduced a card pattern that combined related documents in a single card. We found that most people wanted to search for a complete project, then look at individual files within that grouped project.
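The data shape behind the card pattern can be sketched as follows. This is a minimal illustration, not the project's implementation: it assumes each search result already carries a project identifier, and the `Document` fields and `group_into_cards` function are hypothetical names.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Document:
    project_id: str   # assumed: an identifier shared by all files of one project
    title: str
    doc_type: str     # e.g. "Draft EIS" or "Final EIS"
    published: date


def group_into_cards(results: list[Document]) -> dict[str, list[Document]]:
    """Turn a flat, table-like result list into one card per project.

    Cards keep the order in which projects first appear in the results,
    and the documents inside each card are sorted by publication date.
    """
    cards: dict[str, list[Document]] = defaultdict(list)
    for doc in results:
        cards[doc.project_id].append(doc)
    return {pid: sorted(docs, key=lambda d: d.published) for pid, docs in cards.items()}


# Invented sample data for illustration only.
results = [
    Document("mine-01", "Example Mine Final EIS", "Final EIS", date(2013, 12, 1)),
    Document("road-07", "Sample Highway Draft EIS", "Draft EIS", date(2015, 3, 10)),
    Document("mine-01", "Example Mine Draft EIS", "Draft EIS", date(2011, 9, 15)),
]

for project_id, docs in group_into_cards(results).items():
    print(project_id, [d.doc_type for d in docs])
```

A flat table shows three unrelated rows here, while the grouped form shows two projects, with the mine's draft and final reports side by side in date order.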

The pencil sketch on the left is exactly what I showed at a team meeting to sell the idea of combining related documents in a single card. 

Sketch and wireframe for new card and filters pattern

The eventual card layout grouping related documents and showing text snippets. 

Figma mockup of search results page grouping related documents within the project box they belong to.

New question: Does our green color scheme look too environmentalist?

In the summer of 2022, a partisan divide stalled President Biden’s climate and energy bill, which would fund projects like NEPAccess that aimed to improve the NEPA process.
It was finally passed as the “Inflation Reduction Act.” Communicating political neutrality was critical to making any change to current policy. We replaced the green palette with cool grays, and redesigned the visuals around a neutral, light, minimalist data-science look.

 

2021 green palette
2022 gray palette

Testimonials

Laura shared with me your findings based on the three interviews conducted thus far. Excellent overview and analysis of priority fixes! … First, we learned SO much from these interviews. Thank you so much for organizing them and thinking through the resulting changes that need to be made. 

I also wanted to support the recommendation that the programmers work on some of these high priority fixes before we schedule additional interviews in September… With some of these basic items fixed now, we can then learn more about how users are likely to dig more deeply into more complex searches.

–Kirk Emerson, Professor of Practice in Collaborative Governance at the University of Arizona School of Government and Public Policy

“I’m a hard grader and I’ve never, ever had an easier time finding NEPA documents. I was quickly able to find everything I needed.”

Betty Dehoney, President, National Association of Environmental Professionals (NAEP)

Metrics

User testing Participants: I interviewed over 30 people from 5 different personas or user groups.

In 2023, a year and a half after the site went public, and still in beta testing phase:

527 registered users in 9 user groups

2,195 searches from around the USA

2,209 downloads (conversions)

Search success metrics. Measuring search success requires more users. It is a next step, but was interrupted by a funding pause.

Learning

What we learned

We learned in early project phases that people wanted to: 

  • Use filters or faceted search to narrow results rather than advanced search. This facilitates complex searches, browsing, and discovery.
  • Search full document text for relevant sections. Since titles are so long and inconsistent, few people made title-only searches.
  • Find all related documents comprising a single project, grouped together and ordered by date.
  • Do complex searches combining multiple topics (a need voiced by NEPA professionals). This can be done in the search box with modifier keystrokes, and it enables meta-analysis: collecting ideas from many projects on a single topic.

Professional NEPA users were more excited about the tool than we expected

“What NEPA access offers me is the ability to quickly search to find those critical issues on the project I’m working on, find ways that others have dealt with it. And this is, by the way, the first time that something like this has ever been available, and it’s really a game changer… And I’m really excited to be using this on every project.”

–Consultant, multinational infrastructure consulting firm

Once a person found a document that might be useful, they still had to search within the PDF to find the text that matched their query. 

Search is only half of the work. Once they found a candidate and downloaded it, a person had to spend more mind-numbing time searching through hundreds of pages within the PDF to find the relevant sections.

This prompted our research into retrieval-augmented generation (RAG), an AI approach, to find and summarize long sections of text.

User research engages early adopters, who build social proof within their networks.

According to the “law of diffusion of innovation” (E.M. Rogers, 1962), there is a tipping point: new ideas must diffuse through enough early adopters before they enter the mainstream. Our participants became early adopters.

Design must translate computer system models to human mental models of how things work

Nobody wanted to read technical language in the interface. One of my tasks was to replace system language with labels that everyone already understood. They didn’t want to think about how the system worked.

Search carries a lot of cognitive load

Search takes a lot of mental effort, and EISs are among the most complex papers to write. If we can reduce the working memory demanded by the search interface, and make it simpler and easier to learn, we can free up a user’s mental resources for the complex research problems they are trying to answer.

 Are they tech-savvy?

We found “tech-savvy” to be a meaningless label. High domain knowledge does not mean high computer skills. When we designed for a common citizen, everyone benefited, even domain experts like lawyers and agency professionals.

Conclusion

This project aimed to create an innovative search site that would make scientific knowledge more accessible to legal processes. By bridging the gap between science and the law, the tool could help streamline complex legal procedures, reduce costs, and ultimately lead to better-informed decisions on complex social and environmental issues.

Applying the power of design to this problem was a satisfying creative challenge that also contributed to a larger social benefit.

Future

Although the project was put on hold in 2024 due to a lapse in federal funding, this new minimalist mockup (created in Figma) still serves as a guide for future development.