Business & Marketing

CustomGPT.ai

A knowledge-base website chatbot that feels unusually human and keeps follow-up context well, but it can still hallucinate support contacts.

Strong empathy · Follow-up context held · RAG · Hallucinated contact info

Best tone in the test, with one serious production caveat

CustomGPT was the strongest overall performer in this benchmark for teams that want a customer-support chatbot to sound warm, personal, and human instead of robotic. It retrieved warranty, shipping, and returns information accurately across simple, medium, and complex questions, and it handled follow-up questions with especially strong context retention. The main risk is production-critical: in the frustration scenario, it hallucinated support contact details that were not in the knowledge base. That means its tone is excellent, but it still needs strict grounding controls before going live.

Hands-on CustomGPT test walkthrough.

In-Depth Review

Our detailed analysis of CustomGPT.ai — features, performance, and real-world testing.

Rugved
AI Demos Team
Verified Review

Feature-by-Feature Breakdown

Knowledge-base answer generation
Strong factual retrieval across warranty, shipping, and returns content.
Test Summary
Feature tested: Knowledge-base answer generation
Result: Passed — Strong factual retrieval across warranty, shipping, and returns content.

Expected behavior: Retrieves and summarizes support-policy information from uploaded knowledge-base documents for both direct and complex customer questions. In this test, it was exercised on warranty coverage, delivery timelines, and a non-returnable product with a manufacturing defect under warranty.

INPUT
What does the warranty cover?

CustomGPT correctly answered the warranty coverage question by pulling directly from the uploaded policy document. It outlined what the warranty covers — including manufacturing defects and hardware failures — and cited the relevant policy section as a source, confirming the response was grounded in the knowledge base rather than a generic AI reply.

INPUT
How long does delivery take?

CustomGPT answered the delivery timeline question accurately by retrieving the information directly from the uploaded policy document. It broke down delivery timeframes by shipping type — standard and express — and cited the source section, confirming the answer was grounded in the knowledge base rather than a generic estimation.

INPUT
If my product is non-returnable but develops a manufacturing defect within warranty, what options do I have?

CustomGPT handled this complex cross-policy question well — correctly distinguishing between the return policy and the warranty policy. It clarified that while the product is non-returnable, a manufacturing defect within the warranty period still qualifies for repair or replacement under the warranty clause. The response cited both the returns and warranty sections of the policy document, showing it can resolve questions that require reasoning across multiple policy areas simultaneously.

Bottom Line
Retrieval quality was strong across simple, medium, and complex policy questions, with omissions more common than hallucination in standard knowledge-base answers.
Conversational follow-up handling
One of its strongest capabilities: it preserved context and handled nuanced follow-ups cleanly.
Test Summary
Feature tested: Conversational follow-up handling
Result: Passed — One of its strongest capabilities: it preserved context and handled nuanced follow-ups cleanly.

Expected behavior: Maintains conversational context across turns and answers narrower follow-up questions without losing the original topic. This was tested on warranty nuance, shipping eligibility, and claim-outcome follow-ups.

INPUT
Is battery degradation covered under warranty?

On the follow-up 'Is battery degradation covered under warranty?', CustomGPT kept the warranty context and made the subtle but important distinction that normal battery degradation is excluded, while charging failure or certified manufacturing defects in rechargeable batteries are covered for up to 6 months. The answer also showed a source reference below the reply.

INPUT
What about express delivery for remote areas?

On 'What about express delivery for remote areas?', it preserved the shipping context and answered directly that express delivery is not available for remote regions. It then added the relevant constraints: eligible pin codes in non-remote areas only, non-hazardous products, orders placed before 2:00 PM local time, and exclusions for oversized products, hazardous materials, and marketplace seller orders.
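The constraints listed in that reply can be read as a single eligibility rule. The following is a minimal sketch of that reading, not NovaTech's or CustomGPT's actual logic: the `Order` fields and the `express_eligible` function are hypothetical names chosen for illustration, and the rule only restates what the bot reported.

```python
from dataclasses import dataclass
from datetime import time


@dataclass
class Order:
    # All field names are hypothetical, chosen only for this illustration.
    is_remote_pin: bool          # delivery pin code falls in a remote region
    is_hazardous: bool
    is_oversized: bool
    is_marketplace_seller: bool
    placed_at: time              # local time the order was placed


# "Orders placed before 2:00 PM local time", as stated in the bot's reply.
EXPRESS_CUTOFF = time(14, 0)


def express_eligible(order: Order) -> bool:
    """Return True only if every constraint listed in the reply is met."""
    if order.is_remote_pin:
        return False
    if order.is_hazardous or order.is_oversized or order.is_marketplace_seller:
        return False
    return order.placed_at < EXPRESS_CUTOFF
```

Writing the rule out this way makes it easy to see why the follow-up answer counts as complete: each exclusion the bot mentioned corresponds to one early return, and an order placed at exactly 2:00 PM falls outside the "before" cutoff.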

INPUT
Will I get a refund or only a repair?

On 'Will I get a refund or only a repair?', it correctly said repair or replacement is the usual warranty outcome and that refunds are possible but depend on verification. It also mentioned original-payment-method refunds and that partial refunds or restocking fees can apply. The nuance was mostly strong, but the researcher noted the answer leaned too hard toward partial refunds even though the knowledge base also allows full refunds in some cases.

Bottom Line
Context retention was excellent across all tested follow-ups; the only notable miss was an over-narrow refund interpretation in the most complex thread.
Source reference display
Visible source references were present in multiple answers.
Test Summary
Feature tested: Source reference display
Result: Passed — Visible source references were present in multiple answers.

Expected behavior: Shows document-level source references beneath replies so users can see which knowledge-base file grounded the answer. This was observed on warranty, shipping, and refund-related follow-ups.

INPUT
Is battery degradation covered under warranty?

The response included a 'Sources referenced in this response' panel pointing to novatech_warranty_coverage_guide.pdf, giving visible grounding rather than an uncited answer.

INPUT
What about express delivery for remote areas?

The answer showed a source reference to NovaTech Delivery SLA Policy.pdf beneath the reply, making the shipping constraint traceable to the knowledge base.

INPUT
Will I get a refund or only a repair?

The response displayed novatech_warranty_coverage_guide.pdf as a referenced source, showing that citation support remained visible even in a follow-up exchange.

Bottom Line
CustomGPT visibly cited source documents in multiple replies, though this test did not independently audit citation completeness for every answer.
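Because citation completeness was not audited in this test, a team running its own evaluation might want a simple check over logged replies. The sketch below is one possible approach under stated assumptions: the reply IDs and the `audit_citations` helper are hypothetical, and the file list contains only the two documents named in this review, which may not be the full knowledge base.

```python
def audit_citations(replies, kb_files):
    """Map each reply id to a list of problems; clean replies are omitted."""
    known = {name.lower() for name in kb_files}
    problems = {}
    for reply_id, sources in replies.items():
        if not sources:
            # The reply showed no source panel at all.
            problems[reply_id] = ["no source shown"]
            continue
        unknown = [s for s in sources if s.lower() not in known]
        if unknown:
            # The reply cited a file that is not in the uploaded set.
            problems[reply_id] = [f"unknown source: {s}" for s in unknown]
    return problems


# File names as they appear in this review; the real knowledge base may hold more.
KB_FILES = [
    "novatech_warranty_coverage_guide.pdf",
    "NovaTech Delivery SLA Policy.pdf",
]

# Hypothetical log: reply id -> sources displayed under that reply.
logged = {
    "battery-follow-up": ["novatech_warranty_coverage_guide.pdf"],
    "express-remote": ["NovaTech Delivery SLA Policy.pdf"],
    "frustration": [],
}

report = audit_citations(logged, KB_FILES)
```

A check like this only confirms that a source was shown and that it belongs to the uploaded set. It does not verify that the cited file actually supports the claim, which still needs a manual read.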
Empathetic support response
Best conversational tone in the benchmark, but also the source of the test's biggest safety issue.
Test Summary
Feature tested: Empathetic support response
Result: Passed — Best conversational tone in the benchmark, but also the source of the test's biggest safety issue.

Expected behavior: Adapts tone to the user's emotional state and responds more like a live support agent than a policy search tool. This was visible across normal policy questions and was stress-tested with an explicitly angry customer message.

INPUT
I am so frustrated! This is the 3rd time my order has been wrong. I feel like breaking my PC right now!

When given an angry message about receiving the wrong order three times, the bot immediately acknowledged the user's emotion, apologized, and invited the user to explain what went wrong so it could help with order status, complaint handling, or replacement steps. The empathy opener was the strongest observed in this benchmark.

INPUT
I am so frustrated! This is the 3rd time my order has been wrong. I feel like breaking my PC right now!

CustomGPT handled the frustrated user message with empathy while staying on-policy. It acknowledged the repeated wrong orders, avoided engaging with the threat of breaking the PC, and redirected the conversation to the correct resolution path — reporting a wrong item with the required documentation. The response balanced emotional sensitivity with accurate policy guidance without skipping either.

Bottom Line
CustomGPT had the most human-feeling support tone of any tool tested, but that same style also produced the report's most serious grounding failure: fabricated contact details.
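One way to catch this kind of failure before a reply reaches a customer is a post-generation check that compares any contact details in the reply against the knowledge-base text. The sketch below is an illustration only and is not a CustomGPT feature: the function name, the regular expressions, and the sample strings are all assumptions, and the phone pattern is deliberately loose, so it can also flag long order numbers.

```python
import re

EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# Loose phone pattern: a digit, at least six digits/separators, then a digit.
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{6,}\d")


def _digits(text: str) -> str:
    """Strip everything except digits so formatting differences are ignored."""
    return re.sub(r"\D", "", text)


def ungrounded_contacts(reply: str, kb_text: str) -> list[str]:
    """Return emails and phone numbers in the reply that the KB text lacks."""
    kb_emails = {m.lower() for m in EMAIL_RE.findall(kb_text)}
    kb_phones = {_digits(m) for m in PHONE_RE.findall(kb_text)}

    flagged = []
    for email in EMAIL_RE.findall(reply):
        if email.lower() not in kb_emails:
            flagged.append(email)
    for phone in PHONE_RE.findall(reply):
        if _digits(phone) not in kb_phones:
            flagged.append(phone)
    return flagged


# Placeholder strings, not the actual reply or knowledge base from this test.
kb = "For wrong items, submit a report with your order ID and photos."
reply = "I'm so sorry! Call 1-800-555-0199 or write to help@example.com."
flagged = ungrounded_contacts(reply, kb)
```

If `flagged` is non-empty, the reply could be blocked, regenerated, or routed to a human. This is a narrow guard for contact details only and does not replace the platform's own grounding settings.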

Pricing & Access

Standard
$99/mo (or $89/mo billed annually)
10 custom AI agents, 5,000 pages/chatbot, 60M words content, 1,000 queries/month, 3 team members, basic analytics, OpenAI API access, helpdesk support. 7-day free trial available.
Premium
$499/mo (or $449/mo billed annually)
100 custom AI agents, 20,000 items/chatbot, 300M words stored, 5,000 GPT-4 queries/month, 5 team members, remove CustomGPT branding, PII removal, OCR support, 1-on-1 support.
Enterprise
Custom pricing
Custom chatbot & token limits, SSO, audit logs, dedicated account manager, call & email support, custom SLAs, SOC 2 Type II certified, HIPAA compliant. Annual commitment required.

Pricing verified June 2026. We re-check quarterly.
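For a quick comparison of the two published tiers, the listed prices and query allowances can be turned into an annual saving and an effective price per included query. This is a rough sketch using only the figures above. It assumes the monthly allowance is fully used and ignores overage charges, which the listing does not describe.

```python
# Figures copied from the Standard and Premium listings above (USD).
PLANS = {
    "Standard": {"monthly": 99, "annual_monthly": 89, "queries": 1000},
    "Premium": {"monthly": 499, "annual_monthly": 449, "queries": 5000},
}


def annual_saving(plan: str) -> int:
    """Yearly saving from annual billing versus paying month to month."""
    p = PLANS[plan]
    return (p["monthly"] - p["annual_monthly"]) * 12


def cost_per_query(plan: str) -> float:
    """Monthly-billing price divided by the included monthly queries."""
    p = PLANS[plan]
    return p["monthly"] / p["queries"]


for name in PLANS:
    print(f"{name}: save ${annual_saving(name)}/yr, "
          f"${cost_per_query(name):.4f} per included query")
```

On these numbers, annual billing saves $120 a year on Standard and $600 on Premium, and both tiers work out to roughly $0.10 per included query at monthly billing. The Premium premium therefore buys higher limits and extra features rather than a lower per-query price.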

Is This Right For You?

A side-by-side guide based on our hands-on testing.

✓ Use This If
You want a customer-facing website chatbot that sounds warm, personal, and more like a real support agent than a policy lookup tool.
You need strong retrieval on support-policy content such as warranty, shipping, and returns questions.
You care about follow-up context retention across multi-turn customer-service conversations.
Visible source references matter to your workflow, even if you are not expecting fully audited citation behavior in every reply.
✕ Skip This If
You need strict no-hallucination behavior for support contact details or escalation instructions.
You prefer short, utilitarian answers over longer empathetic replies with repeated follow-up prompts.
You need strong evidence on deployment setup, widget customization, or embed implementation from the test itself; those areas were not deeply evaluated here.

How it performed in this benchmark

Result for the tested use case: build a website chatbot that answers questions from your knowledge base.

#1
Build a Website Chatbot That Answers Questions From Your Knowledge Base
Strongest performer tested so far for this use case thanks to accurate retrieval, excellent follow-up handling, and standout empathy. Main caveat: it hallucinated support contact details in the frustration scenario.
Business & Marketing · Customer Support Chatbots · text · Marketing · Founders
Retrieval accuracy: It performed strongly on direct policy retrieval. In this test, it accurately answered warranty coverage, shipping timelines, and a complex non-returnable defect question. The more common issue was omission of some available details, such as international warranty variation or Premium-member delay compensation, rather than generic hallucination in standard policy answers.
Follow-up context: It retained context well across all three follow-up chains tested. It correctly handled battery degradation as a warranty nuance, confirmed that express delivery is unavailable for remote areas without losing the shipping thread, and answered the refund-versus-repair follow-up within the original warranty-claim context.
Source references: Several responses displayed a visible 'Sources referenced in this response' section tied to specific files such as novatech_warranty_coverage_guide.pdf and NovaTech Delivery SLA Policy.pdf.
Biggest issue: In the angry-customer scenario, CustomGPT hallucinated a phone number and support email that were not in the knowledge base. For a customer-support chatbot, that is a production-critical grounding failure even though the rest of the response was highly empathetic.
Best fit: Warm customer experience. It was the most conversational and empathetic tool tested, but that also meant some answers became long and its follow-up prompts sometimes felt formulaic or slightly off-topic.
Deployment and setup: Not deeply evaluated. The report included a tool demo video and the screenshots showed deploy prompts, but the hands-on findings focused on retrieval quality, follow-up handling, citations, and tone rather than a detailed widget-setup or customization evaluation.
