UI automation tests play a crucial role in ensuring the quality of mobile applications. Despite the growing popularity of machine learning techniques for generating these tests, such techniques still face several challenges, such as the mismatch of UI elements. Recent advances in Large Language Models (LLMs) have addressed these issues by leveraging the models' semantic understanding capabilities. However, a significant gap remains in applying these models to industrial-level app testing, particularly in terms of cost optimization and knowledge limitation. To address this, we introduce CAT, which creates cost-effective UI automation tests for industry apps by combining machine learning and LLMs with best practices. Given a task description, CAT employs Retrieval-Augmented Generation (RAG) to source examples of industrial app usage as few-shot learning context, assisting LLMs in generating the specific sequence of actions. CAT then employs machine learning techniques, with LLMs serving as a complementary optimizer, to map each action to its target element on the UI screen. Our evaluations on the WeChat testing dataset demonstrate CAT's performance and cost-effectiveness: it achieves 90% UI automation at a cost of $0.34, outperforming the state of the art. We have also integrated our approach into the real-world WeChat testing platform, where it has proven useful in detecting 141 bugs and enhancing the developers' testing process.
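The retrieval step described above can be illustrated with a minimal sketch. It is not CAT's implementation: the abstract does not say how examples are stored, scored, or formatted, so the example store, the bag-of-words cosine similarity (a stand-in for whatever retriever is actually used), and the names `EXAMPLES`, `retrieve`, and `build_prompt` are all assumptions made for illustration. No LLM is called; the sketch only assembles the few-shot prompt that would be sent to one.

```python
import math
import re
from collections import Counter

# Hypothetical store of past app-usage examples: (task description, action sequence).
EXAMPLES = [
    ("send a text message to a contact",
     ["tap Chats", "tap contact", "type message", "tap Send"]),
    ("post a photo to moments",
     ["tap Discover", "tap Moments", "tap camera icon", "select photo", "tap Post"]),
    ("add a new contact by phone number",
     ["tap Contacts", "tap Add", "type phone number", "tap Search"]),
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def retrieve(task, examples, k=2):
    """Return the k stored examples whose descriptions are most similar to the task."""
    query = Counter(tokenize(task))
    ranked = sorted(
        examples,
        key=lambda ex: cosine(query, Counter(tokenize(ex[0]))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(task, examples, k=2):
    """Assemble a few-shot prompt: retrieved examples first, then the new task."""
    parts = []
    for description, actions in retrieve(task, examples, k):
        parts.append(f"Task: {description}\nActions: {'; '.join(actions)}")
    parts.append(f"Task: {task}\nActions:")
    return "\n\n".join(parts)


if __name__ == "__main__":
    print(build_prompt("send a message to a friend", EXAMPLES))
```

Retrieving only the closest examples keeps the prompt short, which is one plausible way a retrieval step could reduce per-test LLM cost. The second stage, mapping each generated action to an on-screen element, is not shown here.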

Index Terms

Enabling Cost-Effective UI Automation Testing with Retrieval-Based LLMs: A Case Study in WeChat
1. Software and its engineering
  1. Software creation and management
    1. Software verification and validation
      1. Software defect analysis
        Software testing and debugging
