
Data collection and general processing

This topic provides information about how Cloud Cruiser collects and processes data.

Unless otherwise stated, Cloud Cruiser operates on UTC time. Keep this in mind when configuring remote Cloud Cruiser servers (on-premises installations of Cloud Cruiser that collect data for transmission back to the central Cloud Cruiser Portal): configure remote servers to use UTC time.

Metering scripts

Metering scripts are installed on FC customers' premises. These scripts are responsible for interrogating the equipment for the metrics and identifiers that will be used for billing. Scripts typically run daily, but if necessary they can be run multiple times per day.

The output of a metering script can be any of the following formats:

  • csv
  • xml
  • json
  • ccr (Cloud Cruiser Record format)

HPE is responsible for installing, scheduling, and maintaining metering scripts on FC customers' equipment.

Usage upload

Email is the current mechanism for transmitting data to Cloud Cruiser. Emails must be sent to the fcs-usage@hp.com address for automated processing, and must arrive no later than 03:30 UTC to be picked up by the automated download script. Email not received by the cut-off time is not included in nightly processing; in that case, the automated data interpolation feature takes over.

Metering data files must be attached as standard email attachments.
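As an illustration only, a metering host could submit its usage file with a short Perl script such as the following. MIME::Lite is one common way to send an attachment; the sender address, SMTP relay, and file name here are placeholder assumptions, and the file name follows the format described in the next section.

#!/usr/bin/perl
# Hypothetical upload sketch: email a compressed metering file to the processing mailbox.
use strict;
use warnings;
use MIME::Lite;

my $attachment = '3PAR_ExampleCustomer_10.0.0.5_NA_201506021900.csv.zip';   # placeholder name

my $msg = MIME::Lite->new(
    From    => 'metering@example.com',      # placeholder sender
    To      => 'fcs-usage@hp.com',
    Subject => "Usage upload $attachment",
    Type    => 'multipart/mixed',
);

# Attach the metering data file as a standard email attachment.
$msg->attach(
    Type        => 'application/zip',
    Path        => $attachment,
    Filename    => $attachment,
    Disposition => 'attachment',
);

$msg->send('smtp', 'smtp.example.com');     # placeholder SMTP relay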

Attachment file names

The name of the attached data file must use the following format:
<category>_<customer_alias>_<ip>_<rsvd>_<timestamp>.<extension>

  • <category>--Can be SAN-Switches, 3PAR, BL-Servers, or IaaS-Server (BL-Servers and IaaS-Server are both accepted as server categories)
  • <customer_alias>--A unique ID that identifies the source of the data
  • <ip>--The IP address of the device
  • <rsvd>--Not used by this implementation of the Cloud Cruiser Portal
  • <timestamp>--The date and time the snapshot was taken, in YYYYMMDDHHMM format
  • <extension>--Can be zip, tgz, or enc

For example, the following file names are formatted correctly:

  • SAN-Switches_HP-EMEA-GWE-SWE-00034_10.254.212.13_10000027F824D9BE_201506021945.xml.zip
  • IaaS-Server_WEDOSInternet_10.28.25.60_CZ2504086T_201506021900.xml.zip
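The following Perl fragment is a minimal sketch of how such a name could be validated and split into its parts. The regular expression and field handling are illustrative, not the actual portal code.

#!/usr/bin/perl
# Illustrative parser for <category>_<customer_alias>_<ip>_<rsvd>_<timestamp>.<extension> names.
use strict;
use warnings;

my $name = 'SAN-Switches_HP-EMEA-GWE-SWE-00034_10.254.212.13_10000027F824D9BE_201506021945.xml.zip';

if ($name =~ /^([^_]+)_([^_]+)_([^_]+)_([^_]+)_(\d{12})\.(.+)$/) {
    my ($category, $customer, $ip, $rsvd, $timestamp, $extension) = ($1, $2, $3, $4, $5, $6);
    print "category=$category customer=$customer ip=$ip timestamp=$timestamp extension=$extension\n";
} else {
    warn "unrecognized attachment name: $name\n";
}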

Attachment content

This implementation supports several file encodings and compression options.

  • Compressed zip files, which may contain one or more enc, xml, csv, or other usage files, or further zip files.
  • Compressed tgz files, which uncompress to tar files; the tar files may contain zip, enc, xml, csv, or other usage files.
  • Encrypted enc files, which may contain xml, csv, or other usage files.

Usage retrieval

Processing

Processing begins by scanning all email received in the 24-hour window preceding the processing date. For example, if downloading data for 2015-05-25, the routine searches for all email received since 2015-05-24 00:00:00 UTC.

Cloud Cruiser scans all candidate emails for attachment names that match the expected date. Only the year, month, and day are considered when matching. For example, for the date 2015-05-25, Cloud Cruiser searches for all attachments whose names contain the string 20150525.
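The following sketch shows how the selection date could be turned into the mailbox search window and the attachment-name filter. It is illustrative only; the portal's own logic lives in download_usage.pl.

#!/usr/bin/perl
# Illustrative computation of the search window and attachment-name filter for a select date.
use strict;
use warnings;
use Time::Local qw(timegm);
use POSIX qw(strftime);

my $select_date = '20150525';                                  # YYYYMMDD processing date
my ($y, $m, $d) = $select_date =~ /^(\d{4})(\d{2})(\d{2})$/;

# Emails are scanned from 00:00:00 UTC on the previous day.
my $since = timegm(0, 0, 0, $d, $m - 1, $y) - 24 * 60 * 60;
print 'search mail received since: ', strftime('%Y-%m-%d %H:%M:%S UTC', gmtime($since)), "\n";

# Attachments are matched on the year, month, and day only.
my $filter = qr/\Q$select_date\E/;
print "keep attachment\n" if 'SAN-Switches_X_10.0.0.1_NA_201505251945.xml.zip' =~ $filter;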

Cloud Cruiser downloads files into the cc-working/usage_files directory, sorted by customer and by date according to the <customer_alias> part of the attachment name: cc-working/usage_files/<customer_alias>/<YYYYMMDD>. Within the <YYYYMMDD> folder are two folders:

  • downloaded, where the raw downloaded attachments are placed
  • decrypted, where the uncompressed, decrypted (consumable) files are placed

Depending on the file type, Cloud Cruiser might perform further processing:

  • Zip and tgz files are decompressed and unpacked into the downloaded folder.
  • Encrypted files are decoded into the decrypted folder.
  • Non-encrypted files are moved into the decrypted folder.
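A rough sketch of this per-file dispatch is shown below. The decryption command is a placeholder; the actual tooling used for enc files is not shown here.

#!/usr/bin/perl
# Illustrative dispatch of a downloaded attachment into the downloaded/ and decrypted/ folders.
use strict;
use warnings;
use File::Copy qw(move);

my ($file, $downloaded_dir, $decrypted_dir) = @ARGV;

if ($file =~ /\.zip$/i) {
    # Zip archives are unpacked into the downloaded folder.
    system('unzip', '-o', $file, '-d', $downloaded_dir) == 0
        or die "unzip failed for $file\n";
}
elsif ($file =~ /\.tgz$/i) {
    # tgz archives expand to tar content; unpack into the downloaded folder.
    system('tar', '-xzf', $file, '-C', $downloaded_dir) == 0
        or die "tar failed for $file\n";
}
elsif ($file =~ /\.enc$/i) {
    # Placeholder for decoding an encrypted file into the decrypted folder.
    system('decrypt_usage', $file, $decrypted_dir) == 0       # hypothetical decryption helper
        or die "decryption failed for $file\n";
}
else {
    # Plain usage files (csv, xml, json, ccr) are moved to the decrypted folder as-is.
    move($file, $decrypted_dir) or die "move failed for $file: $!\n";
}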

Cloud Cruiser performs special processing for the following resource types:

3PAR

The <category> part of the attachment name is 3PAR.

The helper script is cc-working/scripts/get_3par.sh.

Cloud Cruiser parses the following sections of the report from the received csv file:

  • ShowSYS
  • ShowPDcols
  • ShowRC
  • ShowVVRaw

P2000

The <category> part of the attachment name is P2000.

The helper script is cc-working/scripts/get_p2000.pl. Report files always arrive as identical pairs with the filename pattern P2000_<customer_alias>_silo-ctl[a|b]_<timestamp>. The helper script selects one of ctla or ctlb for processing.

StoreOnce

The <category> part of the attachment name is StoreOnce.

The helper script is cc-working/scripts/get_storeonce.pl.

Cloud Cruiser checks whether the file is a Fusion Manager file and discards all others (only Fusion Manager files contain usage data). Cloud Cruiser creates new usage files named after the hostname with a .segment extension, populates them with data lines from the StoreOnce* file, and removes commas from measures. A rough sketch of this segment handling appears after these notes.

XP7

The <category> part of the attachment name is either fcs_Thin or fcs_ThP.

The helper script is cc-working/scripts/get_xp7.pl.

Cloud Cruiser renames the files to a standard naming convention for collection and adds a header to csv files for collection purposes.
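For StoreOnce, the per-host segment handling described above could look roughly like this. It is a sketch only; the field separator and the exact comma handling in get_storeonce.pl are assumptions.

#!/usr/bin/perl
# Illustrative StoreOnce post-processing: write data lines to per-host .segment files
# and strip thousands-separator commas from the measures.
use strict;
use warnings;

my %handles;
while (my $line = <STDIN>) {
    chomp $line;
    my ($hostname, @measures) = split /;/, $line;      # assumed field separator
    next unless $hostname;

    s/,//g for @measures;                               # "1,234" becomes "1234"

    unless ($handles{$hostname}) {
        open $handles{$hostname}, '>>', "$hostname.segment"
            or die "cannot open $hostname.segment: $!\n";
    }
    print { $handles{$hostname} } join(';', $hostname, @measures), "\n";
}
close $_ for values %handles;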

After processing is complete, files move to Triage.

Maintenance

Usage retrieval is driven by the _DownloadUsage workbook. The download process is scheduled to begin at 03:30 UTC. This time can be changed, but daily processing cannot commence until the download has fully completed. The workbook contains a single step, which calls the download_usage.pl collection script.

You can monitor progress in the Monitoring area of the Cloud Cruiser Portal. The job should complete without warnings or errors. If there are warnings, investigate them immediately, because they can mean that usage files were not downloaded properly (leading to billing errors or lost revenue).

Advanced configuration

Advanced configuration is performed by editing the download_usage.pl script.

To change the IMAP server, edit the following section:

#####################################################################################
# SET IMAP SERVER DETAILS HERE
#####################################################################################
# Open an SSL connection to the IMAP server that receives the usage mail.
my $socket = IO::Socket::SSL->new(
      PeerAddr => '<imap_server>',
      PeerPort => <port>,
      SSL_verify_mode => SSL_VERIFY_NONE
   );

To configure the IMAP account details, edit the following section:

#####################################################################################
# SET IMAP PASSWORD HERE
#####################################################################################
# Log in to the IMAP account over the SSL socket opened above.
my $imap = Mail::IMAPClient->new
(
   User     => '<username>',
   Password => '<password>',
   Ssl      =>  1,
   Socket   => $socket,
   IgnoreSizeErrors => 1,
   Debug    => 0, Debug_fh => $dh
);

You can call the script directly from the command line or from an ExecuteCommand step in any workbook. The script has two input parameters:

  • The selection date in YYYYMMDD format
  • The IMAP folder to query (for example, FC)
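For example, an ad-hoc run for 2015-05-25 against the default FC folder might look like the following. The script location and invocation details are assumptions; adjust them to your installation.

perl cc-working/scripts/download_usage.pl 20150525 FC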

Using IMAP folders and server-side filtering rules, you can direct emails to dedicated IMAP folders to remove them from the common processing. You can then use the download_usage.pl script to download those usage files separately, and potentially process them differently. For example:

  1. Create a filter to direct all email from customer AceRun to the IMAP folder FC/AceRun.
  2. Call download_usage.pl with the parameter FC/AceRun.

You can also use this method to test changes to the download_usage.pl script.
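Conceptually, the folder parameter selects which IMAP folder the script scans for usage mail. The fragment below is a sketch only, continuing from the $imap handle shown in the configuration code above; the script's own folder handling may differ.

# Sketch only: the second command-line parameter selects the IMAP folder to scan.
my $folder = $ARGV[1] // 'FC';               # for example FC or FC/AceRun
$imap->select($folder)
    or die "cannot select folder $folder: ", $imap->LastError, "\n";

# Only messages in this folder are considered, so mail filtered into FC/AceRun
# is excluded from the common FC processing.
my @messages = $imap->messages;              # message IDs in the selected folder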

Triage

After data is retrieved and pre-processed by the nightly _DownloadUsage process, it is ready for triage. Triage includes the following activities, all performed by the _ProcessUsage workbook. (This workbook must be scheduled to run after the _DownloadUsage workbook.)

  • Gathering all of the data produced by the download process.
  • Performing a bulk conversion of this data into the Cloud Cruiser data format.
  • Applying processing common to all customers to the data on a device-by-device basis. For example, process all ShowPDcols data, then all Blade data, then all StoreOnce data, and so on.
  • Sorting the processed usage data into feeds used by the customer-specific workbooks.

The steps in triage are Collection, Worksheet processing, and Sorting.

Collection

The following table lists the source files in the ${env.usageDir}/*/${env.selectDate}/decrypted directory used by each collector.

Cloud Cruiser runs bulk collection on the following data files:

Collector name | Type | Collection source | Description
Collect-3PAR-ShowPDcols | csv | *.ShowPDcols | ShowPDcols report from 3PAR array
Collect-BL-Servers-XML | xml | BL-Servers*.xml | Metering (ON/OFF) report from server and blade metering script
Collect-3PAR-ShowSYS | csv | *.ShowSYS | ShowSYS report from 3PAR array
Collect-SAN-Switches | xml | SAN-Switches*.xml | Metering (ON/OFF) report from Brocade SAN switches
Collect-StoreOnce | csv | *.segment | Metering reports from StoreOnce storage systems
Collect-P2000 | xml | P2000*-ctl_*.xml | Metering reports from P2000 storage systems
Collect-XP7-Thin | csv | fcs-Thin*.csv | Metering reports from XP7 storage systems
Collect-VM | ccr | dataset*.ccr | Collections from virtual machine environments
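For instance, the Collect-3PAR-ShowPDcols collector reads every decrypted file matching *.ShowPDcols across all customers for the selection date. In Perl terms, the file selection is roughly equivalent to the following; the literal values of ${env.usageDir} and ${env.selectDate} are assumptions.

# Illustrative file selection equivalent to the Collect-3PAR-ShowPDcols source pattern.
my $usage_dir   = 'cc-working/usage_files';   # assumed value of ${env.usageDir}
my $select_date = '20150525';                 # assumed value of ${env.selectDate}
my @sources     = glob "$usage_dir/*/$select_date/decrypted/*.ShowPDcols";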

Worksheet processing

Each collection provides data to a sheet in the _ProcessUsage workbook, which performs activities specific to the type of data collected. The following table lists the input files in the cc-working/processing/_ProcessUsage directory used by each worksheet:

Worksheet | Inputs
ShowSYS | Output of the Collect-3PAR-ShowSYS collector: 3PAR-ShowSYS/${env.selectDate}.ccr
ShowPDcols | Output of the ShowSYS worksheet (dataset_ShowSYS_${env.selectDate}.ccr) and output of the Collect-3PAR-ShowPDcols collector (3PAR-ShowPDcols/${env.selectDate}.ccr)
BL-Servers | Output of the Collect-BL-Servers-XML collector: Collect-BL-Servers-XML/${env.selectDate}.ccr
SAN-Switches | Output of the Collect-SAN-Switches collector: Collect-SAN-Switches/${env.selectDate}.ccr
StoreOnce | Output of the Collect-StoreOnce collector: Collect-StoreOnce/${env.selectDate}.ccr
P2000 | Output of the Collect-P2000 collector: Collect-P2000/${env.selectDate}.ccr
XP7-Thin | Output of the Collect-XP7-Thin collector: Collect-XP7-Thin/${env.selectDate}.ccr
VM | Output of the Collect-VM collector: Collect-VM/${env.selectDate}.ccr
SortByCustomer | Output of the other sheets:
  • dataset_ShowPDcols_${env.selectDate}.ccr
  • dataset_BL-Servers_${env.selectDate}.ccr
  • dataset_SAN-Switches_${env.selectDate}.ccr
  • dataset_P2000_${env.selectDate}.ccr
  • dataset_XP7-Thin_${env.selectDate}.ccr
  • dataset_VM_${env.selectDate}.ccr

For example, the ShowPDcols worksheet takes the data imported from the Collect-3PAR-ShowPDcols collector, and performs processing steps common to all customers:

  • Recovery of a device’s IP address from the metering filename
  • Removal of summary rows and empty rows
  • Conversion of MiB to GiB for various meters
  • Averaging of data when more than one metering file is received in a day

These steps appear in the following listing of the ShowPDcols sheet:
Workbook = _ProcessUsage
Sheet = ShowPDcols

Basic Action Step | Import Collections
    Active | Expression
    True   | ImportCollections("Collect-3PAR-ShowPDcols")
    
Transformation Step | Exception Limit = 0
    Active | Expression
    True   | SplitDimension(InputFilename, "_", fn, true, 4)
    True   | RenameDimension(fn_2, @customer, false)
    True   | RenameDimension(fn_3, IP_Address, false)
    True   | SetDimensionFromDimensions targetDimension : @collection, InputDimensionFormulas : {@feed() + "_", IP_Address()}
    True   | DeleteDimensions(fn_1, fn_4, @feed)
    True   | if DimensionMatchesValue(Id, "^-.*") Or DimensionMatchesValue(CagePos, total) then DeleteRow()
    True   | CalculateMeasure(Size, Size_MB / 1024)
    True   | CalculateMeasure(Volume, Volume_MB / 1024)
    True   | CalculateMeasure(Free, Free_MB / 1024)
    True   | CalculateMeasure(Failed, Failed_MB / 1024)
    True   | CalculateMeasure(Spare, Spare_MB / 1024)
    True   | SetDimensionFromLookup(Serial, ArrayMeta, LOOKUPFILE, "3PAR-Meta")
    True   | SplitDimension(ArrayMeta, "|", meta, true)
    True   | RenameDimension nameTable : {meta_1:ArrayName,meta_2:ArrayModel,meta_3:ArraySerial}, exceptionOnNoSource : false, overwrite : false
    True   | CalculateMeasure(Volume, Volume + Spare)
    True   | CalculateMeasure(Spare, 0)
    True   | CalculateMeasure(Usable, Size - Spare - Failed)
    True   | RenameMeasure(Volume, Used)
    True   | DeleteMeasures(Size_MB, Volume_MB, Free_MB, Failed_MB, Spare_MB, Size, Failed, Spare)

Basic Action Step | Aggregate Rows (Multi-File Average)
    Active | Expression
    True   | AggregateRows measureCalculation : AVG, interval : DAILY, adjustUsageInterval : true, createRecordCountMeasure : false

Sorting

After all bulk processing has been performed, the data is sorted back into customer-specific records so that it can be read in by the various per-customer workbooks. This is accomplished by the SortByCustomer worksheet.

Inputs to the SortByCustomer worksheet are the outputs of the other sheets. The sorted data is written using the following path format:
cc-working/processing/_byCustomer/<customer_alias>/<selectDate>.ccr
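Conceptually, the sort step splits the bulk output on the customer dimension and appends each record to that customer's daily file. The following is a simplified illustration; the record layout with a single leading customer field is an assumption, and real CCR data carries many more dimensions and measures.

#!/usr/bin/perl
# Simplified illustration of sorting bulk records into per-customer daily files.
use strict;
use warnings;
use File::Path qw(make_path);

my $select_date = $ARGV[0] // '20150525';
my %handles;

# Assume one record per line with a leading customer field.
while (my $record = <STDIN>) {
    chomp(my $line = $record);
    my ($customer) = split /,/, $line;
    next unless $customer;

    my $dir = "cc-working/processing/_byCustomer/$customer";
    make_path($dir) unless -d $dir;

    unless ($handles{$customer}) {
        open $handles{$customer}, '>>', "$dir/$select_date.ccr"
            or die "cannot open $dir/$select_date.ccr: $!\n";
    }
    print { $handles{$customer} } $record;
}
close $_ for values %handles;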

The input filename (the name of the original downloaded attachment) is included in the CCR data as a dimension. This property enables Cloud Cruiser to add data to the appropriate processing sheets in customer-specific workbooks, as described in Customer-specific processing.


 (c) Copyright 2017-2020 Hewlett Packard Enterprise Development LP