# WebCrawl

### Overview:

A web crawler, also known as a "spider" or "bot," is an automated program that systematically browses the internet to discover, index, and store information from webpages. Used mainly by search engines like Google, it builds a massive, searchable index of the web, helping users find relevant content quickly.
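To make the discover, index, and store cycle concrete, here is a minimal sketch of a breadth-first crawler in Python. It is an illustration only and not how iX Hello implements WebCrawl. The names `crawl`, `LinkExtractor`, and `SITE` are hypothetical, and the `fetch` argument is assumed to be any function that returns a page's HTML for a URL (an in-memory dictionary stands in for real HTTP requests here).

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag


class LinkExtractor(HTMLParser):
    """Collects href values from <a> tags (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, fetch, max_pages=10):
    """Breadth-first crawl. `fetch(url)` returns HTML text or None."""
    index = {}                  # url -> raw HTML (the "store" step)
    queue = deque([start_url])  # pages discovered but not yet visited
    seen = {start_url}
    while queue and len(index) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue            # unreachable page: skip it
        index[url] = html
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop "#fragment" parts.
            absolute, _fragment = urldefrag(urljoin(url, href))
            if absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return index


# Offline demo: an in-memory "site" replaces real network calls.
SITE = {
    "https://example.com/": '<a href="/a">A</a> <a href="/b">B</a>',
    "https://example.com/a": '<a href="/">home</a>',
    "https://example.com/b": "<p>no links</p>",
}
pages = crawl("https://example.com/", SITE.get)
print(sorted(pages))
```

A production crawler would also need to respect `robots.txt`, rate limits, and domain boundaries, all of which this sketch omits.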

### Prerequisites:

* **An iX Hello account.** For steps to create one, see [Create an iX Hello account](https://app.gitbook.com/o/-M8Qw0HjmL3rDRYUXBX0/s/-M8XHvUsfyTUFLvToHqD/general/ix-hello-create-account).
* Use this link to [sign up for the iX Hello platform](https://bots.ixhello.com/Account/Login?ReturnUrl=%2F).
* To create an app using a Quick Start App, refer to [Quick Start Apps](https://app.gitbook.com/o/-M8Qw0HjmL3rDRYUXBX0/s/-M8XHvUsfyTUFLvToHqD/general/rag-file-upload/file-upload-in-studio-mode-1/quick-start-apps).

### **How to create an Assistant using WebCrawl in iX Hello:**

1. On the Home screen, click the Quick Start App button.

<figure><img src="/files/2b154KSwmLdaKE86DMxt" alt="" width="563"><figcaption></figcaption></figure>

2. The Quick Start Apps screen opens. Click the "Add a New Assistant" button.

<figure><img src="/files/ZHJpM5dUJMuGPrUnfi8f" alt="" width="563"><figcaption></figcaption></figure>

3. You will see the Quick Start Apps Library, which helps you create an app from ready-to-use AI apps.
4. To create a new assistant, click the Add a New Assistant button and choose Quick RAG-App.

<figure><img src="/files/cUUGwq6FfiHTxMSplPCx" alt="" width="563"><figcaption></figcaption></figure>

5. Fill out the General Information tab of the Configuration screen.
6. The table below explains each of the inputs shown on this screen.

| Input                         | Information                                                                                                                                                                                                                                                                                                    |
| ----------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| Name                          | <p>This is the display name of the assistant.</p><p>For example: WebCrawl Test</p> |
| Description                   | <p>Explains what the assistant is designed to do.</p><p>For example: WebCrawl Test</p> |
| Welcome Message               | This is the first message the assistant sends when a conversation starts.                                                                                                                                                                                                                                      |
| Goodbye Message               | This is the closing message when the conversation ends.                                                                                                                                                                                                                                                        |
| Persona and Behavior (Sample) | <p>Defines the assistant’s tone, personality, and response style.</p><p>For example: I am a concise and focused assistant, providing accurate information based solely on the provided Content files from URL. I aim to deliver clear and relevant answers to your queries from Content files from URL provided.</p> |

<figure><img src="/files/k5ukAYvcfnVeSNGDtuK1" alt="" width="563"><figcaption></figcaption></figure>

7. Fill out the Knowledge Base Data tab of the Configuration screen. The table below explains each of the inputs shown on this tab.

| Input                     | Information                                                                                                                                                                                                                                                                                                                                                                                               |
| ------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| SharePoint Upload Section | Displays files and folders from the selected SharePoint site.                                                                                                                                                                                                                                                                                                                                             |
| Instructional Text        | <p>Connect to SharePoint, browse folders/files, and select content for <strong>AI training</strong>.</p><ul><li><strong>Supported File Types</strong>: <code>.pdf, .docx, .csv, .xls, .xlsx, .txt, .ppt, .pptx, .gif, .jpg, .jpeg, .png, .webp</code>.</li><li><p><strong>Upload Limits</strong>:</p><ul><li>Max per file: <strong>200MB</strong></li><li>Total: <strong>1GB</strong></li></ul></li></ul> |
| Action Buttons            | Click the Save button to save all the updated information. |
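
The supported file types and upload limits in the table above can be checked before selecting content. The following is a minimal sketch, not part of the iX Hello platform: the function `check_selection` is hypothetical, and it assumes that 1MB means 1024 × 1024 bytes, which the documentation does not specify.

```python
from pathlib import PurePath

# File types listed in the Instructional Text row of the table above.
SUPPORTED = {
    ".pdf", ".docx", ".csv", ".xls", ".xlsx", ".txt", ".ppt",
    ".pptx", ".gif", ".jpg", ".jpeg", ".png", ".webp",
}
# Assumption: binary units (1 MB = 1024 * 1024 bytes).
MAX_FILE_BYTES = 200 * 1024 * 1024      # 200MB per file
MAX_TOTAL_BYTES = 1024 * 1024 * 1024    # 1GB in total


def check_selection(files):
    """Return a list of problems for `files`, a list of (name, size_bytes)."""
    problems = []
    total = 0
    for name, size in files:
        ext = PurePath(name).suffix.lower()
        if ext not in SUPPORTED:
            problems.append(f"{name}: unsupported type {ext or '(none)'}")
        if size > MAX_FILE_BYTES:
            problems.append(f"{name}: exceeds 200MB per-file limit")
        total += size
    if total > MAX_TOTAL_BYTES:
        problems.append("selection exceeds 1GB total limit")
    return problems


# Example: one valid file and one with an unsupported extension.
print(check_selection([("report.pdf", 5_000_000), ("setup.exe", 1_000)]))
```

An empty list means the selection fits within the stated limits under the assumptions above.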

<figure><img src="/files/RyVKSoFCrhtOo2s9baHp" alt="" width="563"><figcaption></figcaption></figure>

8. The bot you created appears under My Assistant. To enable PII Redaction, click "Click Here" in the warning, then click the Save button in the pop-up.
9. To learn more about PII Redaction, refer to [PII-Management](/ixhc/general/pii-management.md).

<figure><img src="/files/Up12GnP3bSKjJoe7OLaT" alt="" width="563"><figcaption></figcaption></figure>

<figure><img src="/files/3FLNwWBWTpQHuwzlPFRW" alt="" width="563"><figcaption></figcaption></figure>

10. Click the "Activate" button to activate the bot. You can then click "Chat Now" to start a conversation with the bot.
11. You can now query the bot based on the content that was uploaded:

<figure><img src="/files/73MwnF9VnHK8Vs3Jc8Rp" alt="" width="563"><figcaption></figcaption></figure>

12. You can access the same bot through Custom Apps.

{% hint style="info" %}
Using the Quick Start App to create this assistant ensures that all the necessary tabs in custom apps are automatically generated or updated. This includes the AI-Content tab, Methods Tab, Intents, Utterance, Slot, and Input.
{% endhint %}

<figure><img src="/files/NTWWI34h8NghREJyNeig" alt=""><figcaption></figcaption></figure>

13. The content provided while creating the assistant is listed as shown below:

<figure><img src="/files/H9mnx1NYmy9OHCbzAZJn" alt=""><figcaption></figcaption></figure>

**a. Preview Content**

* Lets you view the data that has been extracted or crawled from the specified URL.
* Helps verify that the correct content was fetched.
* Ensures the data is relevant and clean before using it for AI training.

**b. Refresh Content**

* Updates the stored content by re-fetching data from the original URL.
* Keeps training data current when the source website changes.
* Ensures the AI model uses the latest information.

**c. Delete**

* Removes the selected training URL and its associated content from the system.
* Cleans up outdated or irrelevant sources.
* Prevents unnecessary data from influencing the AI model.
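
The three actions above can be pictured as operations on a store keyed by training URL. The sketch below is a conceptual model only: `CrawledContentStore` and its methods are hypothetical names, not an iX Hello API, and `fetch` is assumed to be any function that returns the current text for a URL.

```python
class CrawledContentStore:
    """Hypothetical in-memory model of Preview, Refresh, and Delete."""

    def __init__(self, fetch):
        self._fetch = fetch      # function: url -> current page text
        self._content = {}       # url -> text captured at crawl time

    def add(self, url):
        """Crawl a URL once and keep a snapshot of its content."""
        self._content[url] = self._fetch(url)

    def preview(self, url, max_chars=200):
        """Preview Content: show the stored snapshot, not the live page."""
        return self._content[url][:max_chars]

    def refresh(self, url):
        """Refresh Content: re-fetch from the original URL."""
        if url not in self._content:
            raise KeyError(url)
        self._content[url] = self._fetch(url)

    def delete(self, url):
        """Delete: remove the URL and its associated content."""
        del self._content[url]

    def urls(self):
        return list(self._content)


# Offline demo: a dictionary stands in for the live website.
website = {"https://example.com/docs": "version 1"}
store = CrawledContentStore(website.get)
store.add("https://example.com/docs")

website["https://example.com/docs"] = "version 2"   # the source site changes
before = store.preview("https://example.com/docs")  # still the old snapshot
store.refresh("https://example.com/docs")
after = store.preview("https://example.com/docs")   # now the updated text
print(before, "->", after)
```

The demo shows why a refresh is needed: the stored snapshot does not change on its own when the source website changes.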

### Let's test the assistant

<figure><img src="/files/JJXXCIJxgAJ7tDK7G8Zq" alt="" width="563"><figcaption></figcaption></figure>

When the user clicks one of the provided links, the content associated with that link opens within the Chat Widget.

<figure><img src="/files/R7yGC3tEwXjh6LnVQLK2" alt="" width="563"><figcaption></figcaption></figure>


