
CustomGPT.ai
A knowledge-base website chatbot that feels unusually human and keeps follow-up context well, but it can still hallucinate support contacts.
Best tone in the test, with one serious production caveat
CustomGPT was the strongest overall performer in this benchmark for teams that want a customer-support chatbot to sound warm, personal, and human instead of robotic. It retrieved warranty, shipping, and returns information accurately across simple, medium, and complex questions, and it handled follow-up questions with especially strong context retention. The main risk affects production use directly: in the frustration scenario, it hallucinated support contact details that were not in the knowledge base. Its tone is excellent, but it needs strict grounding controls before going live.
In-Depth Review
Our detailed analysis of CustomGPT.ai — features, performance, and real-world testing.
Feature-by-Feature Breakdown
Knowledge-base answer generation
Result: Passed
Verdict: Strong factual retrieval across warranty, shipping, and returns content.
Expected behavior: Retrieves and summarizes support-policy information from uploaded knowledge-base documents for both direct and complex customer questions. In this test, it was exercised on warranty coverage, delivery timelines, and a non-returnable product with a manufacturing defect under warranty.
Test cases (input: text prompt; output artifact: image):
- Warranty coverage question (customgpt-input1-warranty-coverage-step1-initial-response.png)
- Delivery timeline question (customgpt-input2-delivery-express-step1-initial-response-1.png)
- Complex warranty + returns question (customgpt-input3-nonreturnable-defect-step1-initial-response.png)
Why it matters / Conclusion: Retrieval quality was strong across simple, medium, and complex policy questions, with omissions more common than hallucination in standard knowledge-base answers.

CustomGPT correctly answered the warranty coverage question by pulling directly from the uploaded policy document. It outlined what the warranty covers — including manufacturing defects and hardware failures — and cited the relevant policy section as a source, confirming the response was grounded in the knowledge base rather than a generic AI reply.

CustomGPT answered the delivery timeline question accurately by retrieving the information directly from the uploaded policy document. It broke down delivery timeframes by shipping type — standard and express — and cited the source section, confirming the answer was grounded in the knowledge base rather than a generic estimation.

CustomGPT handled this complex cross-policy question well — correctly distinguishing between the return policy and the warranty policy. It clarified that while the product is non-returnable, a manufacturing defect within the warranty period still qualifies for repair or replacement under the warranty clause. The response cited both the returns and warranty sections of the policy document, showing it can resolve questions that require reasoning across multiple policy areas simultaneously.
Conversational follow-up handling
Result: Passed
Verdict: One of its strongest capabilities: it preserved context and handled nuanced follow-ups cleanly.
Expected behavior: Maintains conversational context across turns and answers narrower follow-up questions without losing the original topic. This was tested on warranty nuance, shipping eligibility, and claim-outcome follow-ups.
Test cases (input: text prompt; output artifact: image):
- Warranty follow-up (customgpt-battery-degradation-warranty-answer.png)
- Shipping follow-up (customgpt-express-delivery-remote-areas-policy.png)
- Claim outcome follow-up (customgpt-warranty-refund-vs-repair-response.png)
Why it matters / Conclusion: Context retention was excellent across all tested follow-ups; the only notable miss was an over-narrow refund interpretation in the most complex thread.

On the follow-up 'Is battery degradation covered under warranty?', CustomGPT kept the warranty context and made the subtle but important distinction that normal battery degradation is excluded, while charging failure or certified manufacturing defects in rechargeable batteries are covered for up to 6 months. The answer also showed a source reference below the reply.

On 'What about express delivery for remote areas?', it preserved the shipping context and answered directly that express delivery is not available for remote regions. It then added the relevant constraints: eligible pin codes in non-remote areas only, non-hazardous products, orders placed before 2:00 PM local time, and exclusions for oversized products, hazardous materials, and marketplace seller orders.
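The express-delivery constraints listed above can be read as a simple eligibility rule. The following is a minimal sketch of that reading, not the vendor's or the policy document's actual logic. The `Order` class, its field names, and the reason strings are hypothetical, and the 2:00 PM cutoff is assumed to be exclusive because the reply said "before 2:00 PM".

```python
from dataclasses import dataclass
from datetime import time

# Assumed cutoff: orders must be placed strictly before 2:00 PM local time.
EXPRESS_CUTOFF = time(14, 0)


@dataclass
class Order:
    """Hypothetical order record; field names are illustrative only."""
    pin_code_eligible: bool
    remote_area: bool
    hazardous: bool
    oversized: bool
    marketplace_seller: bool
    placed_at: time  # local time the order was placed


def express_blockers(order: Order) -> list[str]:
    """Return every reason the order fails the express-delivery constraints."""
    reasons = []
    if order.remote_area:
        reasons.append("remote area")
    if not order.pin_code_eligible:
        reasons.append("pin code not eligible")
    if order.hazardous:
        reasons.append("hazardous product")
    if order.oversized:
        reasons.append("oversized product")
    if order.marketplace_seller:
        reasons.append("marketplace seller order")
    if order.placed_at >= EXPRESS_CUTOFF:
        reasons.append("placed at or after 2:00 PM")
    return reasons


def express_available(order: Order) -> bool:
    """Express delivery is available only when no constraint is violated."""
    return not express_blockers(order)
```

Returning the list of blockers rather than a bare yes/no makes it easier to compare a chatbot reply against the policy, since a reply that omits one of the constraints shows up as a missing reason.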

On 'Will I get a refund or only a repair?', it correctly said repair or replacement is the usual warranty outcome and that refunds are possible but depend on verification. It also mentioned original-payment-method refunds and that partial refunds or restocking fees can apply. The nuance was mostly strong, but the researcher noted the answer leaned too hard toward partial refunds even though the knowledge base also allows full refunds in some cases.
Source reference display
Result: Passed
Verdict: Visible source references were present in multiple answers.
Expected behavior: Shows document-level source references beneath replies so users can see which knowledge-base file grounded the answer. This was observed on warranty, shipping, and refund-related follow-ups.
Test cases (input: text prompt; output artifact: image):
- Warranty citation check (customgpt-battery-degradation-warranty-answer.png)
- Shipping citation check (customgpt-express-delivery-remote-areas-policy.png)
- Refund-policy citation check (customgpt-warranty-refund-vs-repair-response.png)
Why it matters / Conclusion: CustomGPT visibly cited source documents in multiple replies, though this test did not independently audit citation completeness for every answer.

The response included a 'Sources referenced in this response' panel pointing to novatech_warranty_coverage_guide.pdf, giving visible grounding rather than an uncited answer.

The answer showed a source reference to NovaTech Delivery SLA Policy.pdf beneath the reply, making the shipping constraint traceable to the knowledge base.

The response displayed novatech_warranty_coverage_guide.pdf as a referenced source, showing that citation support remained visible even in a follow-up exchange.
Empathetic support response
Result: Passed
Verdict: Best conversational tone in the benchmark, but also the source of the test's biggest safety issue.
Expected behavior: Adapts tone to the user's emotional state and responds more like a live support agent than a policy search tool. This was visible across normal policy questions and was stress-tested with an explicitly angry customer message.
Test cases (input: text prompt; output artifact: image):
- Frustration message (customgpt-customer-support-apology-and-escalation.png)
- Frustration scenario safety check (customgpt-input4-frustration-step1-response.png)
Why it matters / Conclusion: CustomGPT had the most human-feeling support tone of any tool tested, but that same style also produced the report's most serious grounding failure: fabricated contact details.

When given an angry message about receiving the wrong order three times, the bot immediately acknowledged the user's emotion, apologized, and invited the user to explain what went wrong so it could help with order status, complaint handling, or replacement steps. The empathy opener was the strongest observed in this benchmark.

CustomGPT handled the frustrated user message with empathy while staying on-policy. It acknowledged the repeated wrong orders, avoided engaging with the threat of breaking the PC, and redirected the conversation to the correct resolution path — reporting a wrong item with the required documentation. The response balanced emotional sensitivity with accurate policy guidance without skipping either.
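One way to guard against the fabricated contact details reported in this review is a post-generation check that compares any email address or phone number in a reply against the knowledge-base text. The sketch below is an illustration under stated assumptions, not a CustomGPT.ai feature or API: the function names are hypothetical, the regular expressions are deliberately simple, and the knowledge base is assumed to be available as plain strings.

```python
import re

# Deliberately simple patterns; real contact formats vary more than this.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def extract_contacts(text: str) -> set[str]:
    """Collect lowercased emails and digit-only phone numbers from text."""
    emails = {match.lower() for match in EMAIL_RE.findall(text)}
    phones = {re.sub(r"\D", "", match) for match in PHONE_RE.findall(text)}
    return emails | phones


def ungrounded_contacts(reply: str, kb_documents: list[str]) -> list[str]:
    """Return contacts in the reply that appear in no knowledge-base document."""
    allowed: set[str] = set()
    for document in kb_documents:
        allowed |= extract_contacts(document)
    return sorted(extract_contacts(reply) - allowed)
```

A non-empty result would be a signal to block the reply or route it to a human agent before it reaches the customer. The check covers only emails and phone numbers, so other invented details (addresses, names, URLs) would need their own handling.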
Pricing & Access
Pricing verified June 2026. We re-check quarterly.
Is This Right For You?
A side-by-side guide based on our hands-on testing.
How it performed in this benchmark
Result for the tested use case: build a website chatbot that answers questions from your knowledge base.
