CRSP Link®
Overview
CRSP and Compustat data are commonly linked to match CRSP event and market data history with Compustat fundamental and supplemental data. Because of different identification conventions, universes, available historical information, and conventions unique to each organization, linking is not a straightforward process. By using the CRSP Link, a data array that contains a history of links using CRSP and Compustat identifiers, subscribers may accurately combine CRSP and Compustat data into a single source of clean, reliable data.
Compustat Xpressfeed provides new security level data requiring adjustments to the linking process between CRSP and Compustat databases. Previously, Compustat included one security per record. Now all securities are available with a new identifier, IID, which can be used along with GVKEY to permanently identify all securities tracked by Compustat, and marker items that identify the security that Compustat considers Primary.
CRSP provides two views of the data through the CRSP Link. While the standard form is the native data and linking information that is organized by Compustat GVKEY, CRSP provides tools to use the link to build CRSP-centric records linked by PERMNO, as needed.
Identifiers used by the link:
- GVKEY
Compustat’s permanent company identifier.
- IID
Compustat’s permanent issue identifier. An identifying relationship exists between IID and GVKEY. Both must be accessed as a pair to properly identify a Compustat security. One GVKEY can have multiple IIDs.
Because Compustat company data ranges can extend earlier than security ranges, there may be some time periods with no identified IID for a GVKEY. In these cases, CRSP assigns a dummy IID ending in “X” as a placeholder in the link. This range may or may not be associated with a CRSP PERMNO, but there is no Compustat security data found during the range when no IID is assigned.
- PRIMISS
Compustat provides a primary marker indicating which security is considered primary for a company at a given time.
- PERMCO
CRSP’s permanent company identifier.
- PERMNO
CRSP’s permanent issue identifier. There is a non-identifying relationship between PERMNO and PERMCO. One PERMNO belongs to only one PERMCO. One PERMCO can have one or more PERMNOs.
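The identifier rules above can be sketched in Python. This is a minimal illustration, not part of any CRSP or Compustat API: the names `CompustatSecurityKey` and `is_dummy_iid` are hypothetical. The only rules taken from the text are that GVKEY and IID must be used as a pair and that CRSP placeholder IIDs end in "X".

```python
from typing import NamedTuple


class CompustatSecurityKey(NamedTuple):
    """GVKEY and IID together identify one Compustat security (hypothetical type)."""
    gvkey: int
    iid: str


def is_dummy_iid(iid: str) -> bool:
    """True for a CRSP placeholder IID, which the text says ends in "X"."""
    return iid.upper().endswith("X")


# Values taken from the GVKEY 011947 example in this document.
real_key = CompustatSecurityKey(gvkey=11947, iid="01")
placeholder_key = CompustatSecurityKey(gvkey=11947, iid="00X")

print(real_key, is_dummy_iid(real_key.iid))
print(placeholder_key, is_dummy_iid(placeholder_key.iid))
```

Because one GVKEY can have multiple IIDs, comparing the pair rather than the GVKEY alone keeps separate securities of the same company distinct.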
The Linking Process
Prior to the introduction of Xpressfeed, Compustat included only one security per record. The links between CRSP and Compustat were between CRSP PERMNO and Compustat GVKEY. Because PERMNO is a security identifier and GVKEY is a company identifier, the linking could be a many-to-one relationship: more than one PERMNO may be linked to a single GVKEY.
CRSP addressed the security links in phases. The initial phase addressed security links for issues after mid-April 2007, when the first Compustat security-level information became available. In this phase, links before that date were maintained by using the old CST link information as a foundation onto which updates and refinements were applied.
The primary goal of the second phase of building the security links was to remove the April 2007 starting limitation to the security-based links and move to a full security link history. Once the full security history was built, it would be used to generate company-based historical linking broken down into primary issue ranges and indicators.
This process is laborious and demanding of CRSP researchers and programmers. The new links are reflected beginning with the 200806 annual release (CMX200806) and the 200810 monthly and quarterly release (CMX200810).
Native Link Access
The native link, which accesses data using GVKEY, GVKEY.IID, and GVKEYX, is used to access all Compustat data, including index data, Canadian records, and off-exchange ranges that cannot be directly linked to CRSP securities. The native link reads Compustat data as organized and identified by Compustat identifiers and can choose the CRSP data associated with those records. Decisions on handling overlaps or soft links are left to the user.
CRSP provides security level link data with a flag, PRIMFLAG, indicating whether or not each link is to Compustat’s identified primary issue. The primary issue flag can be used to restrict the link to one security per company for each range as it was done with the original CRSP link. Primary issue flags are P, Primary as identified by Compustat, or C, Primary assigned by CRSP.
Example: Accessing two separate GVKEYs in Native Mode from the Link table shows that both share a single PERMNO.
GVKEY = 011947
Link History
| LINKDT | LINKENDDT | LPERMNO | LPERMCO | LIID | LINKTYPE | LINKPRIM |
|---|---|---|---|---|---|---|
| 19820701 | 19860304 | 0 | 0 | 00X | NR | C |
| 19860305 | 19890228 | 10083 | 8026 | 01 | LU | P |
GVKEY = 015495
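Link History
| LINKDT | LINKENDDT | LPERMNO | LPERMCO | LIID | LINKTYPE | LINKPRIM |
|---|---|---|---|---|---|---|
| 19880101 | 19890227 | 0 | 0 | 00X | NR | C |
| 19890228 | 19930909 | 10083 | 8026 | 01 | LC | C |
| 19930910 | 19990304 | 0 | 0 | 01 | NR | C |
The final range (19930910 to 19990304) is marked Delisted.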
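The observation in this example, that two GVKEYs share one PERMNO, can be checked with a short Python sketch. The rows are copied from the two link histories above; the function name `gvkeys_by_permno` and the tuple layout are assumptions made for illustration.

```python
from collections import defaultdict

# (GVKEY, LINKDT, LINKENDDT, LPERMNO, LPERMCO, LIID, LINKTYPE, LINKPRIM)
LINK_ROWS = [
    ("011947", 19820701, 19860304, 0, 0, "00X", "NR", "C"),
    ("011947", 19860305, 19890228, 10083, 8026, "01", "LU", "P"),
    ("015495", 19880101, 19890227, 0, 0, "00X", "NR", "C"),
    ("015495", 19890228, 19930909, 10083, 8026, "01", "LC", "C"),
    ("015495", 19930910, 19990304, 0, 0, "01", "NR", "C"),
]


def gvkeys_by_permno(rows):
    """Map each linked PERMNO to the sorted list of GVKEYs that reference it."""
    found = defaultdict(set)
    for gvkey, _linkdt, _linkenddt, permno, *_rest in rows:
        if permno != 0:  # 0 means no CRSP security link exists
            found[permno].add(gvkey)
    return {permno: sorted(gvkeys) for permno, gvkeys in found.items()}


shared = gvkeys_by_permno(LINK_ROWS)
print(shared)
```

In native mode this kind of grouping is left to the user, since the link is organized by GVKEY rather than by PERMNO.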
CRSP_CCM_LINK – Security Link History
Only one set of link information is presented for each calendar range in the Compustat GVKEY and IID history. Soft LX and LD links are included if there is a match that indicates an alternate record or a security on a non-US exchange. CRSP provides no automated methods to use these soft links to connect to CRSP data, but the information is available for the user.
Native Link usage provides access to all Compustat records, regardless of whether or not securities are in the CRSP universe.
| Itm_name | Type | Description |
|---|---|---|
| GVKEY* | integer, primary key (1) | Compustat GVKEY |
| LIID | char(3), primary key (2) | Compustat IID. Dummy IID assigned with an “X” suffix during a range when company data exists but no Compustat security is identified. |
| LINKDT | integer (date), primary key (3) | First effective calendar date of link record range |
| LINKENDDT | integer (date) | Last effective calendar date of link record range |
| LPERMNO | integer | Linked CRSP PERMNO, 0 if no CRSP security link exists |
| LPERMCO | integer | Linked CRSP PERMCO, 0 if no CRSP company link exists |
| LINKPRIM | char(3) | Primary issue marker for the link. Based on Compustat Primary/Joiner flag (PRIMISS), indicating whether this link is to Compustat’s marked primary security during this range. |
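| LINKTYPE | char(3) | Link type code. Each link is given a code describing the connection between the CRSP and Compustat data. |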
* - The GVKEY is the primary key of all Compustat company records when using the native link. In CRSPAccess programming this field is not present in the structure but inherited from the CCMID item in the master structure for the company. In standalone usage the GVKEY field is included.
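As a rough illustration of how the CRSP_CCM_LINK fields might be used in standalone form, the Python sketch below finds the links in effect for a GVKEY on a given date and optionally keeps only primary links (LINKPRIM of P or C). The `LinkRecord` class and `links_on` function are hypothetical names, dates are assumed to be YYYYMMDD integers as in the example, and the record with PERMNO 99999 is invented purely to show the primary filter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinkRecord:
    gvkey: int
    liid: str
    linkdt: int      # first effective date, assumed YYYYMMDD
    linkenddt: int   # last effective date, assumed YYYYMMDD
    lpermno: int     # 0 if no CRSP security link exists
    lpermco: int     # 0 if no CRSP company link exists
    linkprim: str
    linktype: str


PRIMARY_MARKERS = {"P", "C"}


def links_on(records, gvkey, date, primary_only=False):
    """Return records for gvkey that cover date and point to a CRSP PERMNO."""
    hits = []
    for rec in records:
        if rec.gvkey != gvkey or not (rec.linkdt <= date <= rec.linkenddt):
            continue
        if rec.lpermno == 0:
            continue
        if primary_only and rec.linkprim not in PRIMARY_MARKERS:
            continue
        hits.append(rec)
    return hits


RECORDS = [
    LinkRecord(15495, "00X", 19880101, 19890227, 0, 0, "C", "NR"),
    LinkRecord(15495, "01", 19890228, 19930909, 10083, 8026, "C", "LC"),
    LinkRecord(15495, "01", 19930910, 19990304, 0, 0, "C", "NR"),
    # Hypothetical secondary issue, invented only to show the primary filter.
    LinkRecord(15495, "02", 19900101, 19921231, 99999, 8026, "J", "LC"),
]

all_links = links_on(RECORDS, 15495, 19910615)
primary_links = links_on(RECORDS, 15495, 19910615, primary_only=True)
print([rec.lpermno for rec in all_links])
print([rec.lpermno for rec in primary_links])
```

Integer YYYYMMDD dates compare in calendar order, which is why plain `<=` comparisons are enough here.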
CRSP-Centric Link Usage
Accessing Compustat data through ts_print and TsQuery is done through the CRSP-centric mode, meaning that the primary access key in this mode is CRSP PERMNO rather than GVKEY, as used in the Native Access mode. The CRSP identifiers are the access keys while the Compustat identifiers become attributes. There are two options: Primary only, which mirrors the company-level link by ignoring links not to the primary security, and All, which allows use of any link to the PERMNO.
In CRSP-Centric mode a composite record is built using the CRSP Link reading one or more GVKEYs. All GVKEYs with some presence of the PERMNO in the link are accessed. A used-link history is built from these link records by identifying those that cover the ranges of Compustat data needed to link to the CRSP identifier. For time series items that are stored on a fiscal period basis, the link ranges are translated to a fiscal range. This translation simplifies the selection of fundamental data that are applicable to the range and allows for the creation of a composite Compustat record from the applicable ranges that correspond to a CRSP security.
Records in CRSP-Centric form are identical in layout to the native records, but use CRSP PERMNO as the effective key. The Compustat component identifiers (GVKEY, IID, and PRIMISS) are available in a Link Used table in the CRSP records.
Using the CRSP-Centric view simplifies access when viewing Compustat data through CRSP. One drawback, however, is that only data considered a direct link to CRSP, applied using CRSP link rules, are available.
The example that follows accesses the data natively, then through the CRSP-centric view using PERMNO.
Example: Accessing two separate GVKEYs from the Link table shows that both share a single PERMNO.
GVKEY = 011947
Link History
| LINKDT | LINKENDDT | LPERMNO | LPERMCO | LIID | LINKTYPE | LINKPRIM |
|---|---|---|---|---|---|---|
| 19820701 | 19860304 | 0 | 0 | 00X | NR | C |
| 19860305 | 19890228 | 10083 | 8026 | 01 | LU | P |
GVKEY = 015495
Link History
| LINKDT | LINKENDDT | LPERMNO | LPERMCO | LIID | LINKTYPE | LINKPRIM |
|---|---|---|---|---|---|---|
| 19880101 | 19890227 | 0 | 0 | 00X | NU | C |
| 19890228 | 19930909 | 10083 | 8026 | 01 | LC | C |
| 19930910 | 19990304 | 0 | 0 | 01 | NR | C |
Using CRSP-Centric access, the LINKUSED data show which GVKEYs and IIDs are used to build a composite record by PERMNO. Only the rows with USEDFLAG=1 show the GVKEYs and calendar ranges used to build the composite record for PERMNO 10083. From the Link Used table, access the composite history using the Primary PERMNO (LINKPRIM=P).
PERMNO = 10083
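Link Used
| LINKDT | LINKENDDT | GVKEY | IID | LINKID | PERMNO | PERMCO | USEDFLAG | LINKPRIM | LINKTYPE |
|---|---|---|---|---|---|---|---|---|---|
| 19820701 | 19860304 | 11947 | 00X | 5 | 0 | 0 | -1 | C | NR |
| 19860305 | 19890228 | 11947 | 01 | 6 | 10083 | 8026 | 1 | P | LU |
| 19880101 | 19890227 | 15495 | 00X | 0 | 0 | 0 | -1 | C | NU |
| 19890228 | 19930909 | 15495 | 01 | 1 | 10083 | 8026 | 1 | C | LC |
| 19930910 | 19990304 | 15495 | 01 | 2 | 0 | 0 | -1 | C | NR |
| 19990305 | 20051019 | 15495 | 01 | 3 | 86787 | 16430 | -1 | C | LC |
| 20051020 | 99999999 | 15495 | 01 | 4 | 0 | 0 | -1 | C | NR |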
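The selection of used links for PERMNO 10083 can be sketched in Python as follows. The rows are copied from the Link Used example above; the helper name `used_links` and the dictionary layout are assumptions for illustration and do not reflect how CRSPAccess stores the data.

```python
COLUMNS = ("LINKDT", "LINKENDDT", "GVKEY", "IID", "LINKID",
           "PERMNO", "PERMCO", "USEDFLAG", "LINKPRIM", "LINKTYPE")

RAW_ROWS = [
    (19820701, 19860304, 11947, "00X", 5, 0, 0, -1, "C", "NR"),
    (19860305, 19890228, 11947, "01", 6, 10083, 8026, 1, "P", "LU"),
    (19880101, 19890227, 15495, "00X", 0, 0, 0, -1, "C", "NU"),
    (19890228, 19930909, 15495, "01", 1, 10083, 8026, 1, "C", "LC"),
    (19930910, 19990304, 15495, "01", 2, 0, 0, -1, "C", "NR"),
    (19990305, 20051019, 15495, "01", 3, 86787, 16430, -1, "C", "LC"),
    (20051020, 99999999, 15495, "01", 4, 0, 0, -1, "C", "NR"),
]

LINKUSED = [dict(zip(COLUMNS, row)) for row in RAW_ROWS]


def used_links(rows, permno):
    """Keep rows with USEDFLAG == 1 for the PERMNO, ordered by start date."""
    used = [r for r in rows if r["USEDFLAG"] == 1 and r["PERMNO"] == permno]
    return sorted(used, key=lambda r: r["LINKDT"])


composite = used_links(LINKUSED, 10083)
for row in composite:
    print(row["GVKEY"], row["IID"], row["LINKDT"], row["LINKENDDT"])
```

The two surviving rows are the GVKEY 11947 and GVKEY 15495 ranges that together supply the Compustat data for the composite record.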
CRSP_CCM_LINKUSED – CRSP-Centric Link Used History
| Itm_name | Type | Description |
|---|---|---|
| PERMNO* | integer, primary key (1) | CRSP PERMNO used as basis for this history |
| ULINKID | integer | Unique ID per link associated with PERMNO. This is used to join with range data in the LINKRANGE table that describes the data ranges applied from used GVKEYs. |
| UGVKEY | integer | Compustat GVKEY |
| UIID | char(3) | Compustat IID |
| ULINKDT | integer (date), primary key (2) | First effective calendar date of link record range |
| ULINKENDDT | integer (date) | Last effective calendar date of link record range |
| UPERMNO | integer | Linked CRSP PERMNO, 0 if no CRSP security link exists |
| UPERMCO | integer | Linked CRSP PERMCO, 0 if no CRSP company link exists |
| ULINKPRIM | char(3) | Primary issue marker for the link. Based on Compustat Primary/Joiner flag (PRIMISS), indicating whether this link is to Compustat’s marked primary security during this range. |
| ULINKTYPE | char(3) | Link type code. Each link is given a code describing the connection between the CRSP and Compustat data. Values are: |
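| USEDFLAG | integer | Flag marking the link ranges used to build the composite record for the PERMNO; rows with USEDFLAG = 1 are used. |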
* - The PERMNO is the CRSP security identifier used as the basis for a composite Compustat record and serves as the primary identifier for the composite record. In CRSPAccess programming this field is not present in the structure but inherited from the master structure for the company. The APERMNO or PPERMNO key types store the PERMNO in the CCM structure CCMID field and mark the CCMIDTYPE as 3. In standalone usage the PERMNO field is included.
CRSP_CCM_LINKRNG – CRSP-Centric Link History Range
The link history is presented by calendar range. If data are presented on a fiscal basis, the calendar dates must be interpreted as the proper fiscal period. In this case, overlaps can be generated when links change across GVKEYs or when the fiscal year-end month changes.
CRSP generates a range table with information on the fiscal periods associated with each used link for each time series calendar frequency and keyset. This shows ranges in each of the fiscal and calendar calendars available in the CCM. When there is an overlap and used links provide data for the same fiscal period, the link with the latest filing data date is chosen for the fiscal period. This range table shows the ranges from the GVKEY for each type of time series data used to build the composite record for the PERMNO selected.
| Itm_name | Type | Description |
|---|---|---|
| PERMNO* | integer, primary key (1) | PERMNO key built |
| RLINKID | integer, primary key (2) | Unique ID set in the link used record, used for joining range data with the appropriate link |
| RKEYSET | integer, primary key (3) | Keyset of time series object |
| RCALID | integer, primary key (4) | CRSP calendar of time series |
| RFISCAL_DATA_FLG | char(1) | Type of time series data: F = fiscal, C = calendar |
| RBEGIND | integer | First index in time series with valid data for this used link |
| RENDIND | integer | Last index in time series with valid data for this used link |
| RPREVIND | integer | Index of previous data |
| RBEGDT | integer | First calendar date in time series with valid data for this used link |
| RENDDT | integer | Last calendar date in time series with valid data for this used link |
| RPREVDT | integer | Date of previous data |
* - see note on CRSP_CCM_LINKUSED PERMNO.
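The overlap rule described for this range table (when two used links supply data for the same fiscal period, the one with the latest data date is chosen) can be sketched in Python. This is an illustration under stated assumptions, not CRSP's implementation: the tuple layout `(fiscal_period, link_id, data_date)`, the function name `resolve_overlaps`, and the sample values are all hypothetical, and keeping the first candidate on a tie is an arbitrary choice.

```python
def resolve_overlaps(candidates):
    """Pick one link per fiscal period, preferring the latest data date.

    candidates: iterable of (fiscal_period, link_id, data_date) tuples,
    with data_date as a YYYYMMDD integer (an assumed layout).
    """
    chosen = {}
    for period, link_id, data_date in candidates:
        best = chosen.get(period)
        # Strict ">" keeps the first candidate seen when dates tie.
        if best is None or data_date > best[1]:
            chosen[period] = (link_id, data_date)
    return {period: link_id for period, (link_id, _date) in chosen.items()}


# Hypothetical candidates: links 6 and 1 both supply fiscal period 1988.
CANDIDATES = [
    (1987, 6, 19871231),
    (1988, 6, 19881231),
    (1988, 1, 19890331),
    (1989, 1, 19891231),
]

selected = resolve_overlaps(CANDIDATES)
print(selected)
```

The link identifiers play the role of RLINKID, which joins each chosen range back to its record in the link used table.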
Link Actions
This table shows the types of links that are supported by the CRSP CCM link and how they are achieved. A date range is associated with each link so all actions imply an event history.
| # | Action | Input Identifier Type | Output Identifier Type | Link Table |
|---|---|---|---|---|
| 1 | Find all securities in CRSP for Compustat Company data | GVKEY | PERMNO (PERMCO) | crsp_ccm_link (all links used) |
| 2 | Find primary security in CRSP for Compustat Company data | GVKEY | PERMNO | crsp_ccm_link (only links where LINKPRIM is P or C) |
| 3 | Find data in CRSP for a specific Compustat Company and issue | GVKEY/IID | PERMNO | crsp_ccm_link (links with desired IID) |
| 4 | Find Compustat data for a given CRSP security. | PERMNO | GVKEY/IID | crsp_ccm_linkused (history used to build a composite GVKEY record in link used) |
| 5 | Find Compustat company and security data for a CRSP security, only if it is considered primary. | PERMNO | GVKEY/IID | crsp_ccm_linkused (only use links where LINKPRIM is P or C) |
Link Action Notes:
1. CRSP_CCM_LINK contains valid links for all securities provided by Compustat. Each record with a valid link to a PERMNO can be followed to the appropriate CRSP data. The user has the option of restricting links by LINKTYPE to ignore soft links, and using the CRSP PERMCO to identify other issues of the same company not addressed in the link. All PERMNOs found with this method share the company-level data from the GVKEY. The link record IID is needed to match the CRSP PERMNO data to the proper Compustat security level data.
2. Link records with the security not marked Primary are ignored. Otherwise this is the same as #1. The result is that even if multiple CRSP PERMNOs are found, there should be no overlap in the CRSP history used. All PERMNOs found will share the company-level data from the GVKEY, but will match only the Compustat IID indicated in the link record.
3. Given a GVKEY and IID from Compustat, use CRSP_CCM_LINK to get the history of CRSP PERMNOs linked to that company and security. The user has the option of restricting soft links using LINKTYPE. No consideration is given to whether the security is considered primary any time during its history. The link can produce multiple CRSP PERMNOs, but only one link should be found at any time.
4. Given a CRSP PERMNO, use CRSP_CCM_LINKUSED to find Compustat data. Access with APERMNO key type will build a composite GVKEY record from the used link records. CRSP_CCM_LINKRNG is used to find ranges of data for the composite record. Secondary links are ignored, and only the Compustat security data matching the PERMNO are included. There will be one composite security record created with a pseudo IID of 01X.
5. Same as #4, but a link record is ignored if the security matched is not primary. This will result in a smaller range, and a not-found if the PERMNO is never primary for the company. Access with PPERMNO key type is used to select this method.
- PERMCO is not directly supported with linkused, but attached PERMNOs can be found from the PERMCO and the user can select securities with PERMNO. To avoid double-counting company data, the primary flag can be used to ensure that only one security is represented during each time range.
4,5. A user can use a secondary index on PERMNO or PERMCO to find GVKEYs with matching information and see the Compustat data in native form, then handle processing as desired. These reads are not necessarily unique, so it is left to the user to select information from the correct ranges corresponding to the desired CRSP identifier.
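Link action 3 (GVKEY/IID to PERMNO, with optional soft-link filtering and at most one link at any time) can be sketched in Python. The function names and tuple layout are hypothetical. The soft link codes LX and LD are the ones named in this document, and the rows are the GVKEY 15495 ranges from the Link Used example, none of which is a soft link.

```python
SOFT_LINKTYPES = {"LX", "LD"}  # soft links named in the text

# (GVKEY, IID, LINKDT, LINKENDDT, PERMNO, LINKTYPE)
ROWS = [
    (15495, "00X", 19880101, 19890227, 0, "NU"),
    (15495, "01", 19890228, 19930909, 10083, "LC"),
    (15495, "01", 19930910, 19990304, 0, "NR"),
    (15495, "01", 19990305, 20051019, 86787, "LC"),
    (15495, "01", 20051020, 99999999, 0, "NR"),
]


def permno_history(rows, gvkey, iid, include_soft=False):
    """Return sorted (LINKDT, LINKENDDT, PERMNO) ranges for one GVKEY/IID pair."""
    history = [
        (linkdt, linkenddt, permno)
        for g, i, linkdt, linkenddt, permno, linktype in rows
        if g == gvkey and i == iid and permno != 0
        and (include_soft or linktype not in SOFT_LINKTYPES)
    ]
    return sorted(history)


def has_overlap(history):
    """True if any two date ranges in a start-sorted history overlap."""
    return any(prev[1] >= cur[0] for prev, cur in zip(history, history[1:]))


history = permno_history(ROWS, 15495, "01")
print(history)
print(has_overlap(history))
```

Checking adjacent pairs is sufficient because the history is sorted by start date: if no neighbor overlaps, no other pair can.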
Table vs. CRSPAccess Usage Notes
The Link Actions table includes the primary identifiers for the databases: GVKEY for CCM and PERMNO for CRSP Stock. In a standalone setup where data are dumped and stored as a table these identifiers are included in each table and used to join data.
CRSPAccess programming access always organizes all data for one GVKEY (CCM) or PERMNO (CRSP Stock) in a single structure. The primary identifier is set at the full structure level and inherited by all substructures. Therefore the field is not explicitly included in the substructures. When a CCM composite record is built by the crsp_ccm_read_all function the primary identifier becomes the PERMNO used as the key, which is stored in the CCM_ID field of this structure. The LOADTYPE flag is set to 1 to signify that the structure is loaded with a composite record.
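For the standalone case, where the identifiers are stored in each table and used to join data, a minimal sketch using Python's built-in `sqlite3` module might look like the following. The `crsp_ccm_link` columns follow the table described earlier, but the `company_data` table and its two data dates are hypothetical stand-ins for Compustat company records, and the lowercase column names are an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE crsp_ccm_link (
        gvkey INTEGER, liid TEXT, linkdt INTEGER, linkenddt INTEGER,
        lpermno INTEGER, lpermco INTEGER, linkprim TEXT, linktype TEXT,
        PRIMARY KEY (gvkey, liid, linkdt)
    )
""")
conn.executemany(
    "INSERT INTO crsp_ccm_link VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
    [
        (11947, "00X", 19820701, 19860304, 0, 0, "C", "NR"),
        (11947, "01", 19860305, 19890228, 10083, 8026, "P", "LU"),
        (15495, "00X", 19880101, 19890227, 0, 0, "C", "NR"),
        (15495, "01", 19890228, 19930909, 10083, 8026, "C", "LC"),
        (15495, "01", 19930910, 19990304, 0, 0, "C", "NR"),
    ],
)

# Hypothetical company-level table keyed by GVKEY and a data date.
conn.execute("CREATE TABLE company_data (gvkey INTEGER, datadate INTEGER)")
conn.executemany(
    "INSERT INTO company_data VALUES (?, ?)",
    [(11947, 19881231), (15495, 19891231)],
)

# Join on GVKEY, keeping the primary link whose range covers the data date.
matches = conn.execute("""
    SELECT c.gvkey, c.datadate, l.lpermno
    FROM company_data AS c
    JOIN crsp_ccm_link AS l
      ON l.gvkey = c.gvkey
     AND c.datadate BETWEEN l.linkdt AND l.linkenddt
    WHERE l.lpermno <> 0 AND l.linkprim IN ('P', 'C')
    ORDER BY c.gvkey, c.datadate
""").fetchall()
print(matches)
conn.close()
```

In CRSPAccess the GVKEY would instead be inherited from the enclosing structure, so no explicit join column is needed there.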
Security Level Link Data Considerations
Consider the following in order to access the new security level link data.
- Additional security links allow multiple PERMNOs of the same company to link to the same company level data. Users must be aware that the same company data can be retrieved in multiple ways.
- The PERMCO link is no longer needed since a secondary security can link directly between CRSP and Compustat. PERMCO can still be used to find other securities when no direct link is found.
- Security level links are available only during the range of Compustat security data. In some cases, Compustat security data are not available as far back as company data. In others, there may be gaps of security data within a company range. CRSP fills in the available Compustat company data range so at least one link record covers all time periods in the range. If no securities are available during a range, a dummy security is generated for purposes of the link. These dummy securities always have an IID ending with X.
- CRSP assigns a LINKPRIM marker to all link records, based on the Compustat PRIMISS marker. PRIMISS is used to identify the primary security for the company at any given time. LINKPRIM values are:
- P
Primary, identified by Compustat in monthly security data.
- J
Joiner, a secondary issue of a company, identified by Compustat in monthly security data.
- C
Primary, assigned by CRSP to resolve ranges of overlapping or missing primary markers from Compustat in order to produce one primary security throughout the company history.
- N
Secondary, assigned by CRSP to override Compustat. Compustat allows a US and Canadian security to both be marked as Primary at the same time. For purposes of the link, CRSP allows only one primary at a time and marks the others as N.
- CRSP supports an access option of primary PERMNO, or ppermno, which restricts links to only those marked primary.
- The legacy CST format databases remain based on the old company-based links, thus using only the rows marked as primary.
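The gap-filling idea in the considerations above, where ranges of company data with no Compustat security receive a dummy security, can be sketched in Python. This is only an illustration of the concept and not CRSP's actual procedure: the function name `fill_gaps` is hypothetical, the dummy IID "00X" is taken from the examples (the text only requires that it end in X), and security ranges are assumed to lie within the company range.

```python
from datetime import date, timedelta

DUMMY_IID = "00X"  # assumed placeholder, as seen in the examples
ONE_DAY = timedelta(days=1)


def fill_gaps(company_start, company_end, security_ranges):
    """Cover the company range with link ranges, using a dummy IID for gaps.

    security_ranges: list of (start, end, iid) tuples with date objects.
    """
    filled = []
    cursor = company_start
    for start, end, iid in sorted(security_ranges):
        if start > cursor:
            # No security data before this range starts: insert a placeholder.
            filled.append((cursor, start - ONE_DAY, DUMMY_IID))
        filled.append((start, end, iid))
        cursor = max(cursor, end + ONE_DAY)
    if cursor <= company_end:
        filled.append((cursor, company_end, DUMMY_IID))
    return filled


# Dates from the GVKEY 011947 example: company data from 1982-07-01,
# security IID 01 from 1986-03-05 through 1989-02-28.
ranges = fill_gaps(
    date(1982, 7, 1),
    date(1989, 2, 28),
    [(date(1986, 3, 5), date(1989, 2, 28), "01")],
)
for start, end, iid in ranges:
    print(start.isoformat(), end.isoformat(), iid)
```

With these inputs the sketch reproduces the two ranges shown in the GVKEY 011947 link history, one with the placeholder IID and one with IID 01.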