Oracle8i Time Series User's Guide Release 8.1.5 A67294-01 |
|
This chapter explains concepts related to Oracle8i Time Series, and it provides information on using the product. It contains the following major sections:
A time series is a set of timestamped data entries. A time series allows a natural association of data collected over intervals of time. For example, summaries of stock market trading or banking transactions are typically collected daily, and are naturally modeled with time series.
A time series can be regular or irregular, depending on whether or not the time series has an associated calendar.
However, an irregular time series does not have to be used only for high-volume collection of unpredictable data. An irregular time series can be used with predictable data where it is simply not necessary to deal with a calendar. This approach is used in the quick-start demo described in Section 1.6.1.
Data generation for a time series begins with individual transactions, such as trades on a stock exchange or purchases of products. Each transaction has a timestamp and sufficient information to identify that transaction uniquely (such as a stock ticker or a product ID), as well as other pertinent information (such as the price and information to identify the party initiating the purchase or sale).
Individual transaction data is typically rolled up to produce summary data for a meaningful time period, such as a daily summary indicating the trade volume and the opening, high, low, and closing prices for each stock traded that day. This summary data is collected to produce historical data, such as a table of all daily volumes and opening, high, low, and closing prices for all stocks traded for the year 1997. For example, Figure 2-1 shows how data related to securities on a stock exchange is generated.
In Figure 2-1, each trade on the stock exchange includes several items of information, including a ticker and a price (for example, stock XYZ at 37.50). The daily summary data includes the opening, high, low, and closing prices for each ticker (for example, for XYZ: 37.75, 38.25, 37.00, 37.625). The daily data for each ticker is appended to the historical data for the ticker. The daily data is used for such purposes as quote server applications and listing in the next day's newspapers; the historical data is used by such applications as price and volume charting and technical analysis.
The data-collection model for historical data has the following characteristics:
This historical data is modeled using multiple regular time series.
Oracle8i Time Series and the Oracle8i utilities, with their bulk-loading capabilities and transactional semantics, are well suited for the requirements of time series data generation.
Oracle8i Time Series is especially useful in dealing with historical data. This type of data typically has relatively simple metadata but massive data storage requirements. That is, the data attributes (columns) are relatively few and easy to understand (such as ticker, volume, and opening, high, low, and closing prices); however, the number of rows is enormous (for example, data for all listed stocks for all trading days for several years). Moreover, the number of functions that users might want to perform on the data is large: for example, finding various sums, counts, maximum and minimum values, averages, number of trading days between two dates, moving average, and so on.
Figure 2-2 shows an example of historical data stored in a database.
Stock market historical databases have the following general characteristics:
This kind of financial historical data is used in examples in this guide and in the usage demo (see Section 1.6) provided with Oracle8i Time Series.
An Oracle8i Time Series calendar maps human-meaningful time values to underlying machine representations of time.
A calendar can be associated with a time series. However, a calendar does not need to be associated with a time series unless you need to do any of the following:
A time series with an associated calendar is called a regular time series, as described in Section 2.1.1.
A calendar is, of course, necessary for using any of the calendar functions (described in Section 2.9).
A business day calendar, for example, can define the days of the week on which stocks are traded. The holidays, when trading does not occur, are also included in the calendar as exceptions.
If you have more than one time series with the same timestamps, you can associate these time series with the same calendar (that is, use a shared calendar). For example, a calendar of U.S. stock market trading days could be used for all stocks that trade every day. For any stocks that do not fit this pattern, you could create private calendars (for example, an ACME_cal calendar for stock ACME).
The following are key components of a calendar:
A frequency specifies the granularity of the calendar representation. The supported frequencies are second, minute, hour, day, week, 10-day, semi-monthly, month, quarter, semi-annual, and year.
The pattern specifies the repeating pattern of frequencies and an anchor date that identifies a valid timestamp for the first element in the pattern. For example, if the frequency is set to day, the pattern can define which days of the week are included in the calendar. For example, a pattern of `0,1,1,1,1,1,0' over a day frequency defines a calendar over all weekdays. If an anchor date of 01-Jun-1997 (or any Sunday) is specified, then the 7-day pattern begins each Sunday; and Sunday and Saturday (0) are excluded from the calendar, while Monday through Friday (1) are included in the calendar.
Exceptions are timestamps that do not conform to the calendar pattern but that are significant for the calendar definition. There are two kinds of exceptions: off-exceptions and on-exceptions:
On-exceptions can also be used with a zero pattern. Section 2.11 includes a description of using such a calendar for scaling, with quarterly dividend payment dates as on-exceptions.
Each frequency has an associated integer code that is used in function calls. Table 2-1 lists the supported frequencies and their integer codes. The frequencies are explained in Table 2-2.
Frequency | Integer Code |
---|---|
second | 1 |
minute | 2 |
hour | 3 |
day | 4 |
week | 5 |
month | 6 |
quarter | 7 |
year | 8 |
10-day | 10 |
semi-monthly | 16 |
semi-annual | 18 |
Some frequencies allow flexibility in defining pattern anchor dates, whereas other frequencies are more restrictive. Table 2-2 explains the frequencies and any requirements and options relating to the pattern anchor date.
Each frequency has an associated precision. Oracle8i Time Series functions require that input timestamps be of the precision of the frequency associated with the calendar. (The SetPrecision function is the exception: this function takes a calendar and a timestamp and returns a timestamp that conforms to the frequency of the associated calendar.)
A timestamp that is not consistent with the frequency is said to be imprecise. For example, a timestamp of 09-Sep-1997 is imprecise if it is input to a function that is dealing with a calendar whose frequency is 6 (month) or 8 (year) and whose pattern anchor date is not the 9th of a month. When you create a calendar, all timestamps used in the calendar definition (the anchor date for the pattern, and all off- and on-exceptions) must be precise with respect to the frequency. For example, the calendar will not be valid if you specify a frequency of day, an anchor date of 01-Jun-1998 13:00:00. An anchor date of just 01-Jun-1999, however, would valid in this case. (The calendar data types and their attributes are presented in Section 2.3.1.)
Table 2-3 shows the frequencies, their precision conventions, and an example timestamp of each precision using a pattern anchor date of 01-Jan-1998 00:00:00 (midnight), which was a Thursday.
A calendar pattern is specified as one or more zeroes (0) and/or positive integers.
For patterns represented by zeroes and/or ones, each '1' represents a valid timestamp of the frequency and each '0' represents an invalid timestamp. For example:
For patterns containing one or more integers greater than 1, each such integer represents an interval that is a multiple of the frequency. For example, a calendar with a day frequency, a Sunday anchor date, and a pattern of '7' (ORDSYS.ORDTPatternBits(7)) is in effect a weekly calendar where all Sunday timestamps are included and all other days of the week are excluded.
Note that while the actual timestamps that are valid for the calendar will be identical for each of the preceding weekly calendar examples (ORDSYS.ORDTPatternBits(1,0,0,0,0,0,0) and ORDSYS.ORDTPatternBits(7) with day frequency and Sunday anchor dates), these calendars have two different interpretations for use in the context of time scaling. For example, if the ScaleupSum function is invoked on a time series containing data defined for every day for scaling to these two calendars, the following differences in behavior occur:
To define a calendar, you create a table in which to store calendar definitions and then store a row for each calendar to be defined.
Example 2-1 creates a table named stockdemo_calendars and defines a calendar named BusinessDays. The BusinessDays calendar includes Mondays through Fridays, but excludes 28-Nov-1996 and 25-Dec-1996. Explanatory notes follow the example. (For more information and examples of calendar creation, see Section 3.2.)
CREATE TABLE stockdemo_calendars of ORDSYS.ORDTCalendar ( name CONSTRAINT calkey PRIMARY KEY); INSERT INTO stockdemo_calendars VALUES( ORDSYS.ORDTCalendar( 0 'BusinessDays', 4, ORDSYS.ORDTPattern( ORDSYS.ORDTPatternBits(0,1,1,1,1,1,0), TO_DATE('01-JAN-1995','DD-MON-YYYY')), TO_DATE('01-JAN-1990','DD-MON-YYYY'), TO_DATE('01-JAN-2001','DD-MON-YYYY'), ORDSYS.ORDTExceptions(TO_DATE('28-NOV-1996','DD-MON-YYYY'), TO_DATE('25-DEC-1996','DD-MON-YYYY')), ORDSYS.ORDTExceptions() ));
Notes on Example 2-1:
When you want to create calendars that conform to time series data, you can use the DeriveExceptions function to simplify the process. You can use one of several approaches with DeriveExceptions, depending on your needs and the requirements for each approach:
Approach 1 is the most convenient, and is recommended for most customers.
While Approaches 1 and 1A can be performed in a single step, Approach 2 requires an additional step (before DeriveExceptions is called) in order to construct the first time series.
Although Approaches 1 and 1A are simpler in practice, Approach 2 has significant performance advantages when you need to define multiple calendars that have the same frequency and pattern but different exception lists. The first two approaches are less efficient than Approach 2 in this case, because the internal implementation of the first two approaches generates a collection of dates based on the input calendar. If you need to derive exceptions for multiple calendars defined on the same frequency and pattern, this date-generation operation is performed multiple times. You can avoid these multiple date-generation operations by using Approach 2.
Section 3.8 contains more detailed information about using each approach to deriving calendar exceptions.
Oracle8i Time Series provides data types for working with calendars and time series.
All Oracle8i Time Series data types are installed under the ORDSYS schema, and all users must include the ORDSYS schema name when referring to these data types.
Oracle8i Time Series provides the following calendar data types. (Time series data types are described in Section 2.3.2.)
CREATE TYPE ORDSYS.ORDTCalendar AS OBJECT ( caltype INTEGER, name VARCHAR2(256), frequency INTEGER, pattern ORDSYS.ORDTPattern, minDate DATE, maxDate DATE, offExceptions ORDSYS.ORDTExceptions, onExceptions ORDSYS.ORDTExceptions);
CREATE TYPE ORDSYS.ORDTPatternBits AS VARRAY(32500) OF INTEGER; CREATE TYPE ORDSYS.ORDTPattern AS OBJECT ( patBits ORDSYS.ORDTPatternBits, patAnchor DATE);
CREATE TYPE ORDSYS.ORDTExceptions AS VARRAY(32500) OF DATE;
Oracle8i Time Series provides the following time series data types. (Calendar data types are described in Section 2.3.1.)
CREATE TYPE ORDSYS.ORDTNumCell AS OBJECT (tstamp DATE, value NUMBER); CREATE TYPE ORDSYS.ORDTNumTab AS TABLE OF ORDSYS.ORDTNumCell; CREATE TYPE ORDSYS.ORDTNumSeries AS OBJECT ( name VARCHAR2(256), cal ORDSYS.ORDTCalendar, series ORDSYS.ORDTNumTab ); CREATE TYPE ORDSYS.ORDTNumSeriesIOTRef AS OBJECT ( name VARCHAR2(256), cal REF ORDSYS.ORDTCalendar, table_name VARCHAR2(256), tstamp_colname VARCHAR2(30), value_colname VARCHAR2(30), qualifier_colname VARCHAR2(30), qualifier_value VARCHAR2(4000) ); CREATE TYPE ORDSYS.ORDTVarchar2Cell AS OBJECT (tstamp DATE, value VARCHAR2(4000)); CREATE TYPE ORDSYS.ORDTVarchar2Tab AS TABLE OF ORDSYS.ORDTVarchar2Cell; CREATE TYPE ORDSYS.ORDTVarchar2Series AS OBJECT ( name VARCHAR2(256), cal ORDSYS.ORDTCalendar, series ORDSYS.ORDTVarchar2Tab ); CREATE TYPE ORDSYS.ORDTVarchar2SeriesIOTRef AS OBJECT ( name VARCHAR2(256), cal REF ORDSYS.ORDTCalendar, table_name VARCHAR2(256), tstamp_colname VARCHAR2(30), value_colname VARCHAR2(30), qualifier_colname VARCHAR2(30), qualifier_value VARCHAR2(4000) ); CREATE TYPE ORDSYS.ORDTDateTab AS TABLE OF DATE;
The preceding statements show the definition of a numeric time series and a character time series (instance-based and reference-based interfaces), each composed of a calendar instance and a collection. The collection (ORDTxxxTab) is defined as a table of ORDTxxxCell (except for ORDTDateTab, which is a table of DATE). Oracle8i Time Series data types, such as ORDTNumSeries and ORDTVarchar2Series, are input and output parameters of time series functions.
The following statements show the definitions for the ORDTDateRange and ORDTDateRangeTab types. The latter is returned by the GenDateRangeTab function, which is described in Chapter 4.
CREATE TYPE ORDSYS.ORDTDateRange AS OBJECT (start_date DATE, end_date DATE); CREATE TYPE ORDSYS.ORDTDateRangeTabTab AS TABLE OF ORDSYS.ORDTDateRange;
For time series functions that accept two time series, both time series must be defined on calendars that have the same frequency and the same pattern. The calendars may have different exceptions lists and different starting and ending dates.
A number of time series functions perform arithmetic, comparison, and grouping operations. When nulls are encountered in this context, the default behavior is to mirror SQL:
For example, the sum of (1, NULL, NULL, 3) returns 4. The sum of (NULL, NULL, NULL, NULL) returns null.
For example, if there are 5 nulls in the last 30 timestamps for (and including) a specific date, the 30-day moving average on that date is computed using only 25 values (that is, adding only the values that are not null and dividing by 25). However, if all 30 dates (the date and the 29 previous dates) have nulls, the moving average for that date is null.
For example, 10 + NULL returns null.
For example, a GT comparison of 30-Jun-1997 and null returns null.
Note that because PL/SQL does not implement UNKNOWN, these semantics are slightly different than the SQL treatment of comparisons with nulls. In SQL, a comparison operator that encounters a null returns UNKNOWN, which is like a null, except that operations on UNKNOWN return UNKNOWN.
For example, if you are scaling up daily data from 01-Jan-1997 through 30-Jun-1997 to monthly data, and if there are no values for the month of February, a null is returned for February and scaled data is returned for the other months. (Note that this behavior differs from the standard GROUP BY scaling in SQL, in which February would be missing in the scaled results.)
Some functions allow alternate semantics in the form of an option. For example, the scaleup functions allow you to specify IgnoreNulls or IgnoreNullsOFF, as explained in Section 2.11.2. The reference information for each function describes any alternate semantics options.
In comparisons of two time series, it is possible that a timestamp valid for one time series is not valid for the other time series. Operations on two time series having similar calendars return a time series that is defined over a new calendar. This new calendar is derived from the two input calendars, using all of the following:
For example, assume the following two calendars:
The new (derived) calendar is: 01-Feb-1997 through 01-Dec-1997, daily pattern '0,1,1,1,1,1,0' (Monday through Friday), off-exceptions 01-May-1997 and 14-Jul-1997, on-exception 29-Jun-1997.
Figure 2-3 shows the Oracle8i Time Series architecture. At the lowest level, a storage option is required, and this must be a flat index-organized table (IOT), a flat table, or a nested IOT. The actual product consists of PL/SQL packages for calendar, time series, and time scaling functions and for administrative tools procedures. In addition, a collection-based interface between time series storage and the packaged functions is provided.
The rest of this chapter describes this architecture, working from bottom to top in Figure 2-3:
Using Oracle8i Time Series involves storing three different kinds of information:
In the flat table or IOT storage model, these requirements are implemented using three separate tables, and time series detail data is stored as multiple rows in the table or IOT. (Figure 1-1 in Section 1.6.2 shows the tables created by the usage demo.)
In the nested table storage model, the time series detail data is stored as an object in a nested table, that is, as rows at the second level of a nested IOT.
In the time series detail table, the data should be stored in timestamp order, because many of the analytical functions require access to the data in this order. (If timestamps are not in order, the functions perform an internal sort before processing the timestamps.)
A time series is stored as multiple rows in a flat index-organized table (IOT) or a flat table.1 Each row stores a ticker, a timestamp, and composite data. This storage option is shown in Figure 2-2.
The flat IOT or table model has some benefits compared to nested IOTs:
For time series data such as stock market data, where multiple attributes (such as open, high, low, volume and close) share a single timestamp, only a single timestamp column needs to be stored, thus providing efficient utilization of disk storage.
Migrating legacy data is simplest if the target schema is a flat table or flat IOT.
However, using this model means that ensuring time series and calendar integrity cannot be encapsulated. The highest-integrity solution to time series and calendar integrity would disable insert, update, and delete operations using SQL and would implement these operations using member methods of time series and time scaling functions. This approach is not possible with a flat IOT storage model. Instead, a relational view must be defined to ensure integrity (see Section 2.8.2).
To minimize the burden of creating the relational view and other required schema objects, you can use the administrative tools procedures described in Section 2.12.
A time series is stored as rows in a nested IOT. At the first nesting level, the ticker symbol and any metadata associated with the time series are stored. At the second level and associated with each ticker symbol, timestamp and composite data is stored.
The nested IOT storage model has the following advantages compared to the flat IOT or table model:
Time series analysis functions, such as Moving Average, operate on time series data represented by collections that are passed as parameters to PL/SQL functions. This strategy is inefficient when only a portion of the time series is accessed because the server materializes the entire collection into the object cache.
Locator-based access to nested IOTs provides a solution to this problem, by providing PL/SQL functions a handle to the nested table.
The two levels of a nested table allow you to store metadata associated with each time series (such as textual descriptions of the instrument, or information related to stock splits). This simplifies schema management.
Derived time series are those that are computed by applying functions to existing time series. In a composite (multiple-attribute) model, the storage of derived time series is complicated by the fact that derived time series data is available for only one of many columns (at least initially).
With a nested table storage model, a single value column is stored. This cleanly enables the storage of derived time series.
However, the nested IOT storage model has the following disadvantage compared to the flat IOT model: each attribute column requires a separate timestamp column. This increases the storage overhead for multiple-attribute time series data (such as daily stock market data that includes open, high, low, close, and volume attributes). However, many forms of time series data are single-attribute (such as the monthly unemployment rate), and for these formats the nested IOT storage model is ideal.
The interfaces to the time series and time scaling functions rely on the following aspects of the Oracle8i Time Series architecture:
Two basic interfaces to time series and time scaling functions are defined:
In the instance-based interface, the first input parameter to a time series function is an instance of a time series (for example, ORDTNumSeries).
In the reference-based interface, the first input parameter to a time series function is a reference to a time series (for example, ORDTNumSeriesIOTRef). The reference-based interface requires that you provide enough descriptive information to enable the functions to execute dynamic SQL to obtain an instance of a time series.
The data types related to the instance-based and reference-based interfaces (for example, ORDTNumSeries and ORDTNumSeriesIOTRef) are discussed in Sections 2.7.1 and 2.7.2.
Note that both types of interfaces return only instances of time series (for example, ORDTNumSeries). Also, because nesting of time series and time scaling functions is allowed (for example, SELECT (Lead(Mavg, ...) ...)), the instance-based interface is used internally for the second and subsequent levels of nesting.
When possible, you should use the reference-based interface. Although this interface may be difficult to understand initially, it offers significant performance advantages over the instance-based interface. The examples in this guide emphasize the reference-based interface.
Time series and time scaling functions operate on instances of time series objects (for example, an ORDTNumSeries). An instance of a time series object includes a name field, an instance of a calendar, and an instance of a time series. For example, as the following data type definitions for a numeric time series show, ORDTNumTab defines a collection and ORDTNumSeries combines a calendar instance with a collection:
CREATE TYPE ORDSYS.ORDTNumCell AS OBJECT (tstamp DATE, value NUMBER); CREATE TYPE ORDSYS.ORDTNumTab AS TABLE OF ORDTNumCell; CREATE TYPE ORDSYS.ORDTNumSeries AS OBJECT ( name VARCHAR2(256), cal ORDTCalendar, series ORDTNumTab );
For a numeric time series, the time series data is contained in the ORDTNumTab structure. This structure is a table (collection) of a DATE column and a NUMBER column.
Figure 2-4 shows an example of an ORDTNumTab collection type.
Tstamp | Value |
---|---|
01-01-1996 |
22.00 |
01-02-1996 |
23.00 |
... |
... |
12-31-1996 |
... |
Functions such as Mavg (Moving Average, described in Section 2.10.7) use the ORDTNumTab structure as the source data for performing computations, and they use the ORDTCalendar data type to enable navigation through the time series data. The calendar-based navigation is especially useful for functions such as Mavg, which has as input parameters the starting date (startDate) and ending date (endDate) for which to return moving averages and an integer (k) indicating the lookback window (k denoting the number of timestamps, including the current one, over which to compute the moving average). Calendar-based navigation is used to determine the date that is k-1 timestamps previous to startDate.
Although time series and time scaling functions operate on time series instances, they are invoked from SQL using a REF to a time series. For a numeric time series, this type is an ORDTNumSeriesIOTRef. (Section 2.7.2 explains the use of REFs in the reference-based interface.) The REF contains enough information so that time series and time scaling functions can derive the instance (ORDTNumSeries) at runtime (using dynamic SQL).
The convention of defining an interface on a DATE column and a single NUMBER column provides a uniform interface for time series and time scaling functions. Because the underlying IOT that stores time series detail data may have multiple NUMBER columns, the view defining the REF also maps the underlying storage to conform to the two-column interface defined by the ORDTNumSeries data type.
The following are the key aspects of the instance-based interface to time series and time scaling functions:
Oracle8i Time Series provides a reference-based interface for time series and time scaling functions.
This interface provides efficient performance, especially when only a portion of the time series is accessed. The performance benefit of this interface results from the fact that at runtime, the reference-based interface materializes only those rows within the specified date range, as opposed to materializing the entire collection of rows from the time series.
Note: You should use administrative tools procedures (documented in Chapter 7) to create and maintain objects for use with the reference-based interface. Normally, you should not manually create any views used by the reference-based interface, because the specific view definitions may change in future releases and because the administrative tools procedures provide a simple interface. |
The reference-based interface uses the ORDTNumSeriesIOTRef and ORDTVarchar2SeriesIOTRef data types, which include a REF to a calendar, plus several literal values. At runtime, reference-based time series and time scaling functions use these literal values to form and execute a SQL statement (using dynamic SQL) that derives an instance of a time series that contains only the timestamps needed for this instance. The function determines which timestamps are needed based on the startDate and endDate parameters.
The ORDTNumSeriesIOTRef data type is defined as follows:
CREATE TYPE ORDSYS.ORDTNumSeriesIOTRef AS OBJECT ( name VARCHAR2(256), cal REF ORDSYS.ORDTCalendar, table_name VARCHAR2(256), tstamp_colname VARCHAR2(30), value_colname VARCHAR2(30), qualifier_colname VARCHAR2(30), qualifier_value VARCHAR2(4000) );
The attributes of the ORDTNumSeriesIOTRef data type are as follows:
table_name can be a view, but the view must be updatable and must map to an IOT. If the view includes any functions, they must include the PRAGMA RESTRICT_REFERENCES compiler directive with the keywords WNPS, RNPS, and WNDS.
In the Oracle8i Time Series usage demo, the view stockdemo_ts uses the reference-based interface to time series and time scaling functions. The stockdemo_ts view determines which calendar should be associated with the time series by accessing the calendar (stockdemo_calendars) and metadata (stockdemo_metadata) tables. The pricing data is accessed through the underlying table containing historical time series pricing data (stockdemo). For an explanation of the relationship between the reference-based view and the underlying tables in the usage demo, see Section 1.6.2.3.
The stockdemo_ts view is generated by the administrative tools procedures (documented in Chapter 7) with the following definition for the current release of Oracle8i Time Series. (This definition may change in future releases, and therefore you are encouraged to use the administrative tools procedures rather than manually creating such a view.)
CREATE OR REPLACE VIEW stockdemo_ts(ticker,open,high,low,close,volume) AS SELECT meta.tickername, ORDSYS.ORDTNumSeriesIOTRef( substr(meta.tickername, 1, 230) || ' open NumSeries', Ref(cal), 'tsdev.stockdemo', 'tstamp', 'open', 'ticker', meta.tickername), ORDSYS.ORDTNumSeriesIOTRef( substr(meta.tickername, 1, 230) || ' high NumSeries', Ref(cal), 'tsdev.stockdemo', 'tstamp', 'high', 'ticker', meta.tickername), ORDSYS.ORDTNumSeriesIOTRef( substr(meta.tickername, 1, 230) || ' low NumSeries', Ref(cal), 'tsdev.stockdemo', 'tstamp', 'low', 'ticker', meta.tickername), ORDSYS.ORDTNumSeriesIOTRef( substr(meta.tickername, 1, 230) || ' close NumSeries', Ref(cal), 'tsdev.stockdemo', 'tstamp', 'close', 'ticker', meta.tickername), ORDSYS.ORDTNumSeriesIOTRef( substr(meta.tickername, 1, 230) || ' volume NumSeries', Ref(cal), 'tsdev.stockdemo', 'tstamp', 'volume', 'ticker', meta.tickername) FROM stockdemo_metadata meta, stockdemo_calendars cal WHERE meta.calendarname = cal.name;
Depending on which column is selected, a different literal value is applied as an attribute of the ORDTNumSeriesIOTRef data type. For example, for the following query:
SELECT ORDSYS.TimeSeries.Mavg(close, to_date('02-DEC-96','DD-MON-YY'), to_date('31-DEC-96','DD-MON-YY'), 10) FROM TSDEV.stockdemo_ts WHERE ticker='ACME';
The literal value close is used as the value_colname column name. The other attributes of the ORDTNumSeriesIOTRef data type include the timestamp column name (tstamp), a qualifying column name (ticker), and the actual value of the qualifying column (meta.tickername).
The implementation of time series and time scaling functions uses the information stored in the ORDTNumSeriesIOTRef data type to generate the appropriate dynamic SQL statement at runtime. Using the preceding example, to instantiate a time series object (that is, to convert an ORDTNumSeriesIOTRef to an ORDTNumSeries), the Mavg function generates a query that performs the following action (with the logic shown, not the exact syntax):
SELECT tstamp, close FROM tsdev.stockdemo_ts WHERE ticker='ACME' and tstamp BETWEEN <a date range adjusted to reflect the 10-day window and the calendar, including any holidays>;
The Mavg function computes the moving average and returns the result as a time series instance (ORDTNumSeries). For more information about the Mavg function, see Section 2.10.7.
Most time series and time scaling functions rely on calendars that are consistent with time series data.2 By assuming a time series is consistent with its calendar, time series and time scaling functions can use the calendar as a basis for navigation of time series data.
Time series consistency must be maintained; otherwise, functions might raise exceptions or return incorrect results.
For a time series to be consistent, the following must be true:
If some mechanism is not used to enforce these consistency rules, accidental or malicious actions could destroy the integrity of the time series data. For example, a user might delete rows from the middle of the time series, rather than being restricted to deleting rows at the beginning and the end of the date range for the time series.
Enforcing time series consistency can be accomplished with a relational view of time series data that uses an INSTEAD OF trigger to maintain time series consistency. (For an explanation of INSTEAD OF triggers, see the Oracle8i Concepts manual.) This relational view is intended to be used for limited or moderate insert, update, and delete operations; it is not intended for bulk changes to time series data.
The usage demo (see Section 1.6) includes a relational view named stockdemo_sv. This view:
With the relational view, if a timestamp to be inserted is imprecise, an exception is raised. If a timestamp to be deleted is imprecise and if a matching timestamp exists in the time series, the deletion is permitted.
An INSTEAD OF trigger in a relational view enforces rules on insert, delete, and update operations. These rules maintain time series data that conforms to the associated calendar.
For insert operations, the following rules apply:
For delete operations, the following rules apply:
For update operations, the following rules apply:
The INSTEAD OF trigger in a relational view enables you to ensure that a time series meets the consistency requirements described in Section 2.8.1.
The INSTEAD OF trigger allows for multiple timestamps to be inserted or deleted in a single query, given that the group of timestamps inserted or deleted are in the proper order. For example, a specified number of timestamps can be deleted from the beginning of a time series by using a simple range restriction on the timestamp. A specified number of timestamps can be inserted at the end of a time series by using a subquery that references another table containing time series data.
The SQL*Loader utility is useful for loading large amounts of data into a table. For better performance, you should perform bulk loads on underlying tables instead of on relational views. However, after you load data into the tables, you must ensure time series consistency by using one of the following approaches:
If you are sure that all timestamps are correct, it is safe to adjust the calendar to be consistent with the time series. This strategy is normally appropriate when there is a unique calendar per time series.
The DeriveExceptions function is useful for adjusting a calendar to be consistent with the time series.
If you expect time series data to adhere to a predefined calendar, validating each time series is the better approach. This approach is particularly useful if the same calendar is used for all time series data being loaded.
The IsValidTimeSeries function can be used to check if the time series is consistent with the calendar.
For better performance in the case of a shared calendar for all time series, you may want to customize time series validation using PL/SQL. This involves writing custom utility functions that call Oracle8i Time Series product-developer calendar functions (see Section 2.9.2) to test and maintain time series consistency.
Section 3.4 contains additional information and examples of bulk and incremental loading of time series data.
Oracle8i Time Series provides calendar functions for querying and modifying calendars. The calendar functions can be divided into the following categories:
Reference information for all calendar functions is in Chapter 4.
End-user functions let you use the main calendar-related features of Oracle8i Time Series. If you do not need to modify or expand the product's capabilities, you probably can limit your use of calendar functions to those listed in Table 2-4.
Function | Description |
---|---|
Calendar-Creation Functions1 | |
Second | Creates a calendar with a frequency of second. |
Minute | Creates a calendar with a frequency of minute. |
Hour | Creates a calendar with a frequency of hour. |
Day | Creates a calendar with a frequency of day. |
Week | Creates a calendar with a frequency of week. |
Ten_day | Creates a calendar with a frequency of 10-day. |
Semi_monthly | Creates a calendar with a frequency of semi-monthly. |
Month | Creates a calendar with a frequency of month. |
Quarter | Creates a calendar with a frequency of quarter. |
Semi_annual | Creates a calendar with a frequency of semi-annual. |
Year | Creates a calendar with a frequency of year. |
Calendar-Related Functions | |
EqualCals | Returns 1 if the two calendars are equivalent. If a date range is provided, tests only equivalence between the supplied dates. |
GenDateRangeTab | Returns a table of date ranges that represent all of the valid intervals in the input calendar (or from startDate through endDate). |
IntersectCals | Intersects two calendars. |
UnionCals | Returns the union of two calendars. |
IsValidCal | Returns 1 if a calendar is valid and 0 if a calendar is not valid. |
ValidateCal | Validates a calendar; repairs errors where possible. |
Exception-Related Functions | |
InsertExceptions | Inserts a list of timestamps into the appropriate exceptions list or lists. |
DeleteExceptions | Deletes a list of timestamps from the appropriate exceptions list or lists. |
Product-developer functions let you modify and expand the Oracle8i Time Series capabilities. For example, you could use product-developer calendar functions in creating a new function that modified the information returned for the moving average or that returned a net present value for a portfolio of stocks at a specified date.
Table 2-5 lists the product-developer calendar functions.
For an example of using product-developer functions, see Section 3.9.
Time series functions operate on a time series. A time series data type is always used as the input parameter to a time series function.
Reference information for all time series functions is in Chapter 5.
Time series extraction, retrieval, and trim functions operate on any time series type. Extraction functions return one or more time series rows, while retrieval and trim functions return a time series.
Table 2-6 lists the extraction functions.
Table 2-7 lists the retrieval and trim functions.
Shift functions (listed in Table 2-8) lead or lag a time series by a specified number of units, where units reflects the frequency of the calendar for the time series.
Function | Description |
---|---|
Lead | Leads a time series by the specified number of units. |
Lag | Lags a time series by the specified number of units. |
When called from a SQL SELECT expression, a time series function returns an instance of a time series data type, which cannot be displayed. The SQL formatting functions (listed in Table 2-9) facilitate format conversions that allow time series to be displayed.
Aggregate functions (listed in Table 2-10) return scalar or ORDTNumTab values. Each aggregate function can be used in either of the following ways:
Thus, each aggregate function is of the form:
f(ts ORTDNumSeries, [date1 DATE, date2 DATE])
Arithmetic functions (listed in Table 2-11) accept two time series (ORDTNumSeries1,ORDTNumSeries2) or a time series and a constant (ORDTNumSeries1, Const), and perform a pairwise arithmetic operation on each element of the time series. This operation determines the value of each element of the returned time series:
Algorithm for f(ts1, ts2) ForAll i, tsRet(i) = ts1(i) op ts2(i);
Function | Description |
---|---|
TSAdd | Time series addition |
TSDivide | Time series division |
TSMultiply | Time series multiplication |
TSSubtract | Time series subtraction |
Cumulative sequence functions (listed in Table 2-12) operate on successive elements of a time series, accumulating the result into the current element of the output time series. For example, CSUM((1,2,3,4,5)) => (1,3,6,10,15). In this example, the result time series (f(i)), is computed from the input time series (I(i)) as follows:
f(1) = I(1) ForAll i > 1, f(i) = f(i - 1) + I(i)
Function | Returns |
---|---|
Cavg | Cumulative average |
Cmax | Cumulative maximum |
Cmin | Cumulative minimum |
Cprod | Cumulative product |
Csum | Cumulative sum |
The Moving Average (Mavg) function returns a time series that contains the averages of values from each successive timestamp for a specified interval over a range of dates. For example, the 30-day moving average for a stock is the average of the closing price for the specified date and the 29 trading days preceding it.
The Moving Sum (Msum) function returns a sum of values from each successive timestamp for a specified interval over a range of dates. For example, the 30-day moving sum of trading volumes for a stock is the sum of the volume for the specified date and for 29 trading days preceding it.
Table 2-13 lists the moving average and sum functions.
Function | Returns |
---|---|
Mavg | Moving average |
Msum | Moving sum |
The relationship between the input and output time series in the computation of a moving average or sum is illustrated in Figure 2-5. The figure focuses on the common invocation of moving average or sum, where k is the number of timestamps in the lookback window (for example, 30) and a date range (startDate and endDate) is supplied. (For more information about the parameters, see the Mavg function description in Chapter 5.)
Conversion functions (see Table 2-14) convert a time series by filling in missing timestamp-value pairs. With the Fill function, any timestamps that are valid calendar timestamps but missing from the time series are inserted into the time series. This function is especially useful for converting a time series from one calendar to another.
Function | Description |
---|---|
Fill | Fills a time series based on the calendar and fill type. |
Oracle8i Time Series provides functions to scale time series data:
The relationship between the input and output time series in a scaleup operation is illustrated in Figure 2-6, which shows a mapping when scaling from a daily frequency to a monthly frequency.
Figure 2-6 shows all days in February being mapped to the month of February. This mapping also suggests the importance of the precision of timestamps of different frequencies. In the example shown in this figure:
Time scaling is permitted only when the calendar for the target time series is an integral multiple of the calendar for the data to be scaled. For example, weekly data (data associated with a calendar with a week frequency) cannot be scaled up to a calendar with a frequency of month, quarter, half year, or year because a week does not divide evenly into any of these time periods. However, monthly data can be scaled up to a calendar with a frequency of quarter, half year, or year.
Table 2-15 provides a scaling compatibility matrix that shows for each frequency the frequencies to which you can scale up and scale down data. For each cell in the matrix, a Y means that scaling is permitted and a blank means that scaling is not permitted. For scaleup operations, go down the left column to find the desired scale-from frequency, then go across that row to see if scaling is permitted for the desired scale-to frequency. For scaledown operations, go across the top row to find the desired scale-from frequency, then go down that column to see if scaling is permitted for the desired scale-to frequency.
|
Day |
Week |
10-day |
Semi-month |
Month |
Quarter |
Semi-annual |
Annual |
Day |
Y |
Y |
Y |
Y |
Y |
Y |
Y |
Y |
Week |
|
Y |
|
|
|
|
|
|
10-day |
|
|
Y |
|
Y |
Y |
Y |
Y |
Semi-month |
|
|
|
Y |
Y |
Y |
Y |
Y |
Month |
|
|
|
|
Y |
Y |
Y |
Y |
Quarter |
|
|
|
|
|
Y |
Y |
Y |
Semi-annual |
|
|
|
|
|
|
Y |
Y |
Annual |
|
|
|
|
|
|
|
Y |
If the calendar for the target time series has a zero ('0') pattern and one or more on-exceptions that are precise with respect to the calendar frequency, the time scaling functions use the on-exceptions to perform scaling. For example, if quarterly dividend payment dates are defined as on-exceptions with a day frequency calendar that has a zero pattern (ORDSYS.ORDTPatternBits(0)), the ScaledownRepeat function could be used to insert the current quarterly dividend rate in each daily timestamp.
The collection-based interface (operations on collections) for time scaling is discussed in Section 2.11.1.
Reference information for all time scaling functions is in Chapter 6.
The scaleup and scaledown functions accept as input a numeric time series and a destination calendar. A numeric time series is returned, which is scaled based on the destination calendar.
For example, the following statement returns the last closing prices for stock SAMCO for the months of October, November, and December of 1996:
select * from the (select cast(ORDSYS.TimeSeries.ExtractTable( ORDSYS.TimeSeries.ScaleupLast( ts.close, sc.calendar, to_date('01-OCT-1996','DD-MON-YYYY'), to_date('01-JAN-1997','DD-MON-YYYY') ) ) as ORDSYS.ORDTNumTab) from tsdev.stockdemo_ts ts, tsdev.scale sc where ts.ticker='SAMCO' and sc.name ='MONTHLY');
This example might produce the following output:
TSTAMP VALUE --------- ---------- 01-OCT-96 42.375 01-NOV-96 38.25 01-DEC-96 39.75 3 rows selected.
Note that each timestamp reflects the first date of the month in the calendar (following the convention illustrated in Table 2-3), and each value in this case reflects the closing price on the last date for that month in the calendar.
Scaleup functions ignore nulls. For example, ScaleupAvg returns a time series reflecting the average value of each scaled group of non-null values.
Table 2-16 lists the scaleup functions, and Table 2-17 lists the scaledown functions.
All scaleup functions allow you to specify either or both of the following options:
These options do not apply to scaledown functions.
IgnoreNulls controls the behavior with respect to null values. If IgnoreNulls is in effect, nulls in the input time series are not be included in the aggregation being performed. For example, if ScaleupAvg is operating on a group with 12 values, 3 of which are null, only the 9 non-null values are averaged. If IgnoreNullsOFF is specified, then any calculation involving one or more nulls results in a null. For example, if IgnoreNullsOFF is specified and ScaleupAvg is operating on a group with 12 values, 3 of which are null, a null is returned.
DiscardError controls the behavior with respect to gaps in the target calendar. If DiscardError (the default setting) is in effect, then whenever data from the source time series has no corresponding interval in the target time series (which can result from zeros in the pattern bits of the target calendar), an exception is raised. An example of this condition is scaling up daily data to a monthly calendar for the months January through March (anchor date of 01-Jan and pattern of '1,1,1,0,0,0,0,0,0,0,0,0'). Using this example, the default behavior (DiscardError) raises an exception if any input timestamps are from April through December; however, DiscardErrorOFF performs the scaling for January through March and ignores any input timestamps from April through December. The default behavior ensures that certain types of incompatible calendars are not inadvertently used in scaling.
You can express the IgnoreNulls and DiscardError options syntactically using either names or a number. Using names, you can specify one or both of the following as the final parameter or parameters of a scaleup function call:
Instead of using names, you can use a one-digit or two-digit number from Table 2-18:
IgnoreNulls |
|||
|
|
ON |
OFF |
|
ON |
0 |
1 |
OFF |
10 |
11 |
The following examples show the use of names and a number to specify the same options (IgnoreNullsOFF and DiscardErrorOFF):
ORDSYS.TimeScale.ScaleUpAvg('My Timeseries', myTS, targetCal, ORDSYS.TimeScale.IgnoreNullsOFF, ORDSYS.TimeScale.DiscardErrorOFF); ORDSYS.TimeScale.ScaleUpAvg('My Timeseries', myTS, targetCal, 11);
Names and numbers for the options cannot be used in the same function call. For example, the following is not valid:
ORDSYS.TimeScale.ScaleUpAvg('My Timeseries', myTS, targetCal, 1, ORDSYS.TimeScale.DiscardErrorOFF);
An exception is raised if an option is specified twice or if conflicting options are specified (for example, specifying IgnoreNulls and IgnoreNullsOff in the same call). These options can be specified in any order, but they must appear after any other parameters to the function.
Oracle8i Time Series provides procedures that simplify the creation of time series schema objects. The quick-start demo (described in Section 1.6.1) illustrates the use of several of these administrative tools procedures.
Table 2-19 lists the administrative tools procedures. Reference information for these procedures is in Chapter 7.
To create, delete, and modify schema objects using the Oracle8i Time Series administrative tools procedures, you must have been granted one or more of the following roles:
For deletion of time series schema objects, the DBA and TIMESERIES_DBA roles let you delete objects that belong to any user (schema), but the TIMESERIES_DEVELOPER role lets you delete only objects that belong to the current user.
Logging, which is controlled by the Open_Log and Close_Log procedures, relies on the PL/SQL file I/O procedure UTL_FILE, which is documented in the Oracle8i Supplied Packages Reference manual.
To use logging, one or more directories for UTL_FILE output must be defined using the UTL_FILE_DIR parameter in the Oracle initialization file. For information about the UTL_FILE_DIR parameter, see the Oracle8i Reference manual.
1
A time series can be stored in a standard table; however, for performance reasons it is recommended that you use an IOT rather than a standard table.
2
An exception is the Fill function, which can be used to add pairs of timestamps and values to make a time series consistent with the calendar.