Creating collections

This article is part of the procedure "To collect, transform, and publish data": step 2, Create a collection.

This article explains how to configure HPE Consumption Analytics Portal to collect data from a data source. Each collection you create specifies which data to retrieve, how to map it to HPE Consumption Analytics Portal, and other settings, and is associated with exactly one data source. For example, if you have three VMware vSphere systems, you would create a separate collection for each of them. Or you might create two collections that collect different data from the same data source at different times.

For information about the settings you'll work with for a particular type of data source, such as vSphere or Microsoft Azure, see the documentation for your data source type in the list on Collecting, transforming, and publishing.

To create a collection

  1. In an open ETL workbook, click Workbook > Collections > New in the ribbon.
    The list of available collectors appears.
  2. Click the collector to use for this collection.
    You can filter the list by clicking a collector type on the left. When you choose a collector, the Create Collection wizard appears. You use the three tabs of this wizard to specify basic properties for the collection.
  3. In the Name and Source tab, enter a name for the collection in the Collection Name field.
    The name must be unique within the current workbook. The field does not accept characters that aren't allowed in a collection name, such as spaces.
    Leave Active set to Yes for now. You can deactivate the collection later if necessary.
  4. Select or create the data source that this collection collects from by doing one of the following:
    • From the Data Source list, select an existing data source.
    • Click Create New to begin creating a new data source and do the following:
      1. Fill out the fields in the Define New Data Source area.
        For general information about data sources, see Data sources. For information about the specific fields in this data source, see the "Data source" article for your collector under Collecting, transforming, and publishing.
      2. (optional) Click Test to test the data source parameters, then make corrections if the test fails.
      3. Click Apply to save the data source.
  5. Click Next to move to the General Properties tab of the wizard, then fill out or set the following fields and options:
    • Data Sample Limit: The maximum number of records shown during workbook simulation.
    • Exception Limit: The maximum number of exceptions to allow before the collection stops with an error.
    • Exception Log Limit: The maximum number of exceptions for which the collection writes exception records. If you set this number higher than Exception Limit, Exception Limit becomes the effective maximum number of exception records.
    • Keep Hourly Output: (version 4.0.1 and later) Yes creates one output file for each run of the collection, and is intended for collections that are scheduled to run more than once a day, such as hourly. No creates one output file per select date, so that running the collection again for the same date overwrites the previous file.
    • Single Feed Per Day: (version 4.0) This is the previous version of the Keep Hourly Output option, but the values are reversed: Yes creates one output file per select date.
    • Multiple Feeds: (version 4.1 and later) Whether to sort the CC Records for this collection into multiple feeds, enabling you to process each in a separate flow. Each feed is a separate directory under the process directory where collected CC Records are first placed. You can choose individual feeds when importing collections into a flow.
      When this option is set to No, all CC Records for the collection are written to a feed with the same name as the collection. When set to Yes, each collected CC Record is written to the feed identified by the value of its Feed Dimension.
    • Feed Dimension: When Multiple Feeds is set to Yes, the dimension whose value is used as the name of the feed that a record is written to. For example, if Feed Dimension is set to siteName, then a record where the value of the siteName dimension is London is written to the London feed.
  6. Click Next to move to the <Collector> Properties tab and set the properties there.
    For information about the specific fields in this tab, see the "Collector properties" article for your collector under Collecting, transforming, and publishing.
    Most native collectors include templates that you can use to quickly set all of these properties. To use one, click Select, then select a template, then click Apply.
  7. Click Advanced to open the Advanced Configuration dialog box, then create the configuration that your collection requires.
    The collector reads from your data source based on the basic configuration you just specified, then creates a default configuration and displays the initial results so that you can work visually.

    The primary task you perform here is to specify the individual data items to collect from the data source and map those items to CC Record output. For information about using the Output Fields tab to map data to CC Records, see Output fields for a collection. For information about other tabs, fields, and controls that vary by collector, see the "Advanced configuration" article for your collector under Collecting, transforming, and publishing.

    The following screenshot and callouts provide tips for using the Advanced Configuration dialog box to configure any collection:

    Advanced Configuration dialog box with numbered callouts


    1. Click this icon to maximize the Advanced Configuration dialog box within the browser window or to return it to its previous size. Maximizing enables you to see more input and output data.
    2. Click this icon to see a current list of errors and warnings for the collection.
    3. Click Reset to Default to reset everything on the Output Fields tab and some fields and options on other tabs. This action returns these items to the state they were in when the collection was first created, including deleting items you added and reestablishing items you deleted, and then refreshes the grid.
    4. Click Refresh Grid to update the Input Data and Output Data panes by retrieving a new set of sample data and applying output mappings based on the current settings. Use this button to see how changes you've made affect the output of the collection.
    5. Select Show Advanced to show advanced columns in the Fields table. For information about these fields, see Output fields for a collection.
    6. Click Reset Fields to reset all fields in the Output Fields tab. This includes the Fields table and other fields that vary by collector, including other tables in the same pane and fields in other panes above. This action returns these items to the state they were in when the collection was first created, including deleting items you added and reestablishing items you deleted, and then refreshes the grid.
    7. View Input Data to see sample data retrieved from your data source in its raw format, such as XML or JSON. Right-click in this pane to access a context menu offering ways to customize its appearance. The Data Samples menu shows the timestamp of the sample dataset, but does not offer choices to select.
    8. View Output Data to see the CC Records generated from the last retrieved set of sample data. These are displayed the same way as in a worksheet, including the ability to resize columns. Hover your mouse over a measure name to see whether it represents a measure's value, cost, interval, or rate. Right-click in this pane to access a context menu offering ways to customize its appearance and other actions including exporting the data to a plain text document from which you can copy and paste it to other applications.
    9. Use the Data Layout setting to display the Input Data and Output Data panes either horizontally or vertically, whichever gives you the best view of your data.
  8. Click OK to close the Advanced Configuration dialog box, then click Finish to close the Create Collection wizard.
  9. In the ribbon, click Workbook > Save to save the workbook with your new collection.
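
The General Properties settings in step 5 describe two rules that may be easier to follow in code: how Exception Limit caps Exception Log Limit, and how Multiple Feeds and Feed Dimension decide which feed a record is written to. The following Python sketch is illustrative only and is not the product's implementation. The function names, the use of a plain dictionary to stand in for a CC Record, the collection name vsphere_prod, and the cpuHours measure are all assumptions made for the example.

```python
from collections import defaultdict


def max_exception_records(exception_limit: int, exception_log_limit: int) -> int:
    """Upper bound on exception records written for one collection run.

    Per the Exception Log Limit description, a log limit higher than the
    Exception Limit is effectively capped at the Exception Limit.
    """
    return min(exception_limit, exception_log_limit)


def feed_name(collection_name, record, multiple_feeds, feed_dimension=None):
    """Return the name of the feed that a record would be written to.

    `record` is a plain dict standing in for a CC Record (an assumption).
    """
    if not multiple_feeds:
        # Multiple Feeds = No: a single feed named after the collection.
        return collection_name
    # Multiple Feeds = Yes: the feed is named by the Feed Dimension's value.
    # A record without that dimension raises KeyError in this sketch; how
    # the product handles that case is not described in this article.
    return str(record[feed_dimension])


def group_by_feed(collection_name, records, multiple_feeds, feed_dimension=None):
    """Group records by the feed each one would be written to."""
    feeds = defaultdict(list)
    for record in records:
        name = feed_name(collection_name, record, multiple_feeds, feed_dimension)
        feeds[name].append(record)
    return dict(feeds)


# Hypothetical sample records, using the siteName example from step 5.
records = [
    {"siteName": "London", "cpuHours": 12},
    {"siteName": "Paris", "cpuHours": 7},
    {"siteName": "London", "cpuHours": 3},
]

by_site = group_by_feed("vsphere_prod", records, True, "siteName")
print(sorted(by_site))                  # ['London', 'Paris']
print(max_exception_records(100, 500))  # 100
```

With Multiple Feeds set to No, the same call returns a single feed named after the collection, which matches the default behavior described above.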


