Effective Data Infrastructure for a Data-Informed Campus
Higher education is in the midst of a remarkable period of scrutiny and change. Increased calls for accountability, the college completion agenda, and the classic debate between the liberal arts and workplace orientations have created an environment of great uncertainty. Data analytics may help higher education institutions address and meet these challenges; however, operationalizing and integrating data analytics and related technologies into the institution can be difficult.
Boggs and McPhail (2016) addressed the need for advanced technological solutions, writing that “the president of the college and the entire leadership team must strive to keep the technology infrastructure aligned with the mission, vision, and goals of the college” (p. 25). A robust technological infrastructure undergirds a successful analytics function. This includes the physical infrastructure, such as servers and cloud-based storage; logical infrastructure, such as an integrated data warehouse; and policy infrastructure, instilled via effective data governance.
Developing a data-informed campus not only involves information technology professionals but also requires the financial and operational support of executive leadership. While these individuals may not need an in-depth knowledge of data infrastructure and technical practices, a basic understanding helps them support the creation of a data-informed campus. This article introduces three primary concepts that are necessary to the creation of an effective campus data infrastructure: data integration, data governance, and business intelligence.
A reliable, accessible data system is a fundamental component of a data-informed campus. Common sense would suggest that to become data-driven, an institution must have access to its data; real-world experience, however, often suggests otherwise, as many institutions operate on dated enterprise systems not designed for operational reporting, let alone more sophisticated forms of analysis. “Maintaining a sound infrastructure can help ensure that cleaning, sharing, and using large amounts of data for making decisions institution-wide can be carried out smoothly and that different data systems can ‘speak’ to one another” (Ekowo & Palmer, 2017, p. 6). A lack of data integration, coupled with data that appear to be inaccurate or irregularly updated, negatively affects both trust and usage of technology solutions and infrastructure (Klein et al., 2019).
A data warehouse extracts and ingests information from core enterprise systems, such as the student information system and admissions or relationship management system; finance and human resources; and sponsored research. Additionally, myriad new data exist in additional systems, such as learning management systems. A data warehouse applies a series of processes to transform the raw data into structures that are better designed for reporting and business intelligence purposes (Drake & Walz, 2018).
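The extract-and-transform step described above can be illustrated with a minimal sketch. It assumes two hypothetical extracts, one from a student information system and one from a learning management system, that share a `student_id` key; the field names (`program`, `logins`) are invented for illustration and do not reflect any particular vendor's schema.

```python
def integrate(sis_rows, lms_rows):
    """Join SIS and LMS extracts on student_id.

    Returns a tuple: (warehouse_rows, unmatched_ids), where unmatched_ids
    lists students present in the SIS extract but absent from the LMS extract.
    """
    # Index the LMS extract by student_id for constant-time lookups.
    lms_by_id = {row["student_id"]: row for row in lms_rows}

    warehouse_rows, unmatched_ids = [], []
    for sis in sis_rows:
        sid = sis["student_id"]
        lms = lms_by_id.get(sid)
        if lms is None:
            unmatched_ids.append(sid)
        warehouse_rows.append({
            "student_id": sid,
            # Transform: standardize program codes so reports group consistently.
            "program": sis["program"].strip().upper(),
            # Keep a missing LMS record as None rather than guessing a value.
            "lms_logins": lms["logins"] if lms is not None else None,
        })
    return warehouse_rows, unmatched_ids


# Hypothetical sample extracts.
sis_extract = [
    {"student_id": "S001", "program": " biol "},
    {"student_id": "S002", "program": "math"},
]
lms_extract = [{"student_id": "S001", "logins": 14}]

rows, unmatched = integrate(sis_extract, lms_extract)
```

Reporting the unmatched identifiers, rather than silently dropping those students, is one way the loading process can surface the integration gaps that erode trust in the data.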
Gathering and loading data into a data warehousing environment also allows the institution to increase the utility of the data themselves. For example, the transformation process can create derived variables that indicate a student’s class standing based on accumulated credit hours or full- or part-time status based on credit hours enrolled in a given term. Similarly, the student’s enrollment activity could be calculated in terms of its effect on full-time equivalency. Creating variables and indicators within the warehousing process makes the same established business logic and calculated results available to all users of the data warehouse. Creation of defined variables and the storage of those variables in the data warehouse help to create a semantic layer, a centralized source for data that is curated for reporting and analytics purposes. This semantic layer both increases the data’s reliability and helps facilitate data governance.
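As a sketch of how such derived variables might be defined once and reused, the example below computes class standing, full- or part-time status, and full-time equivalency from credit hours. The thresholds (30-credit class bands, 12 credits for full-time status, and a 15-credit FTE divisor) are assumptions chosen for illustration; each institution would substitute its own governed business rules.

```python
from dataclasses import dataclass

# Assumed thresholds for illustration only; not taken from any standard.
CLASS_STANDING_FLOORS = [
    (90, "Senior"),
    (60, "Junior"),
    (30, "Sophomore"),
    (0, "Freshman"),
]
FULL_TIME_MIN_CREDITS = 12
FTE_DIVISOR = 15


@dataclass(frozen=True)
class EnrollmentRecord:
    student_id: str
    accumulated_credits: float
    term_credits: float


def class_standing(accumulated_credits: float) -> str:
    """Return the first band whose floor the student has reached."""
    for floor, label in CLASS_STANDING_FLOORS:
        if accumulated_credits >= floor:
            return label
    raise ValueError("accumulated credits cannot be negative")


def enrollment_status(term_credits: float) -> str:
    return "Full-time" if term_credits >= FULL_TIME_MIN_CREDITS else "Part-time"


def fte(term_credits: float) -> float:
    """Express a student's term enrollment as a fraction of a full load."""
    return term_credits / FTE_DIVISOR


def derive(record: EnrollmentRecord) -> dict:
    """Attach the derived variables to a single enrollment record."""
    return {
        "student_id": record.student_id,
        "class_standing": class_standing(record.accumulated_credits),
        "enrollment_status": enrollment_status(record.term_credits),
        "fte": fte(record.term_credits),
    }


derived = derive(EnrollmentRecord("S001", accumulated_credits=62, term_credits=9))
```

Because every report calls the same functions, a change to a rule (for example, a different FTE divisor) is made in one place and reaches all users of the warehouse.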
Data governance in higher education remains a fairly abstract concept. Broadly speaking, data governance incorporates organizational guidelines to ensure data quality and accountability (Weber et al., 2009). Jim and Chang (2018) created a data governance checklist that incorporates eight major criteria derived from key literature: (a) data governance body or stakeholder group, (b) data quality, (c) data access or restriction, (d) data security, (e) data stewardship, (f) ownership or roles, (g) metadata documentation or organization, and (h) business process integration.
Data governance may be operationalized through technological solutions, such as the data warehouse semantic layer or electronic data dictionaries, but it also includes a significant commitment by institutional leadership. Data governance exists on executive, strategic, and operational levels (Friedrich, 2013). Executive-level governance may set the direction and vision for data and analytics initiatives as well as resolve budget and resource issues. Strategic-level decisions may incorporate data definitions and usage criteria, data access authorizations or restrictions, and review of data for accuracy and relevance to the larger analytics sphere. Friedrich suggested a model of operational data governance that focused on defining project results, plans, and scope in greater detail. This level of governance helps to operationalize governance ideals established by the strategic and executive levels.
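One way to picture an electronic data dictionary that also records access authorizations is the sketch below. The entry fields, role names, and steward titles are hypothetical, and a production system would typically enforce these rules in the warehouse or business intelligence platform rather than in application code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DictionaryEntry:
    name: str
    definition: str
    steward: str              # office accountable for the element's quality
    allowed_roles: frozenset  # roles authorized to view the element


# Hypothetical entries for illustration.
DATA_DICTIONARY = {
    entry.name: entry
    for entry in [
        DictionaryEntry(
            name="class_standing",
            definition="Class level derived from accumulated credit hours.",
            steward="Registrar",
            allowed_roles=frozenset({"analyst", "advisor", "executive"}),
        ),
        DictionaryEntry(
            name="financial_aid_award",
            definition="Total aid awarded to the student for the term.",
            steward="Financial Aid Office",
            allowed_roles=frozenset({"analyst"}),
        ),
    ]
}


def can_access(role: str, field_name: str) -> bool:
    """Allow access only to documented fields the role is authorized to view."""
    entry = DATA_DICTIONARY.get(field_name)
    return entry is not None and role in entry.allowed_roles


def undocumented_fields(columns) -> list:
    """List columns that lack a dictionary entry, for steward review."""
    return sorted(c for c in columns if c not in DATA_DICTIONARY)
```

In this sketch an undocumented field is denied by default, which reflects a design choice: a data element becomes available for reporting only after a steward has defined it.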
To facilitate a data-driven environment, the institution must have a well-designed analytics solution that combines data from a variety of disparate sources in a single, governed environment (Ekowo & Palmer, 2017). Integrating data from the student information system, enterprise resource planning system, learning management system, and other campus data sources into a centralized data warehouse, coupled with effective data governance policies, contributes to an effective institutional analytics function.
Kerrigan (2014) identified access to and use of data as potential limitations to an effective data-driven environment. Specifically, respondents in the study noted the challenges of working with enterprise systems designed for neither broad use nor longitudinal analyses (Kerrigan, 2014). Business intelligence solutions are designed to allay these concerns. Business intelligence tools can provide an avenue for organizational constituents to access and interact with data. This interaction has been made easier by the addition of data warehousing on many campuses, which reorganizes and integrates data to facilitate analysis (Drake & Walz, 2018).
Davenport and Harris (2007) introduced a widely used framework on data, information, and intelligence; this framework suggests that value increases as one moves toward optimization. The lowest level of the framework is standard query and reporting activities (Davenport & Harris, 2007). Traditionally, business intelligence solutions have been report-based (Drake & Walz, 2018). These static reports, often delivered as lists, spreadsheets, or delimited text files, do not easily allow for exploration and further research. Over the past several years, data visualization has evolved as a solution within the business intelligence domain. Data visualizations help to provide structured environments for data exploration. Visualizations and dashboards can incorporate data standards and institutional data governance practices within the semantic layer, allowing end users to answer questions and perform analyses in a defined environment (Drake & Walz, 2018).
Well-designed technology solutions align with users’ needs and workflows. Misaligned or ineffective technologies contribute to both mistrust and lack of use (Klein et al., 2019). Data visualizations (see Figures 1 and 2) can be created by a centralized institutional effectiveness function or other group of specialized analysts and then shared across campus or with selected users. The visualization developer can control what data are displayed, what interactive abilities exist within the dashboard, and the available level of analysis.
When designing visualizations, one must be mindful of the needs of the end user. Knight et al. (2016) found that including end users in dashboard development was more likely to result in “better informed development of and, ultimately, sustainable use of . . . models and dashboards” (p. 215). In addition to visualization development by a central analytics function, modern data visualization solutions also allow permissioned end users to design their own visualizations using predefined and curated data sources. This allows a degree of self-service analysis, while the centralized analytics function maintains control of data governance and validity.
Data visualization can be used to both provide accessibility to data and promote interpretability of more complex data. For example, course sequences and course content could be visualized to discover hidden relationships and course redundancies, enabling the institution to redesign or improve courses or delivery to increase efficiency (Vatrapu et al., 2011). Greer and Mark (2016) noted that data visualization can help users to identify hidden or obscure patterns in data; similarly, Bueckle et al. (2017) suggested data visualization as a method to help teachers interpret and analyze student data without a need for sophisticated mathematical knowledge.
The nature of the data and its intended purpose will often drive the choice between delivery via data visualization or more traditional business intelligence reporting. Dashboards and other visual solutions, however, offer a tremendous entry point to working with institutional data across a variety of skill levels. Executive leadership, boards of trustees, and other high-level constituencies can access summary dashboards created by a central analytics function, while individual departments can use curated datasets to answer specific inquiries. As data consumers become more familiar with the data visualization solution, they can conduct increasingly complex analyses and discover new insights.
A continued and heightening focus on accountability, institutional performance, and data-informed decision making has led to an influx of hiring of data professionals and institutional research staff. For these individuals to maximize their returns, however, they must have access to adequate data resources and systems that will allow them to perform these analyses. If something cannot be measured and the resultant data stored, it is not possible to perform anything other than a “gut feel” or purely qualitative assessment. The data integration, data governance, and business intelligence tools introduced in this article represent three ways an institution’s technology infrastructure can support efforts toward becoming a data-informed campus.
Boggs, G. R., & McPhail, C. J. (2016). Practical leadership in community colleges: Navigating today’s challenges. Jossey-Bass.
Bueckle, A., Ginda, M., Ranga Suri, N. N. R., & Börner, K. (2017). Empowering instructors in learning management systems: Interactive heat map dashboard. https://cns.iu.edu/docs/publications/2018-borner-ginda-suri-bueckle-empowering-instructors-poster.pdf
Davenport, T., & Harris, J. (2007). Competing on analytics: The new science of winning. Harvard Business Press.
Drake, B. M., & Walz, A. (2018). Evolving business intelligence and data analytics in higher education. New Directions for Institutional Research, 178, 39–52. https://doi.org/10.1002/ir.20266
Ekowo, M., & Palmer, I. (2017, March). Predictive analytics in higher education: Five guiding principles for ethical use. https://na-production.s3.amazonaws.com/documents/Predictive-Analytics-GuidingPractices.pdf
Friedrich, F. (2013, December 6). Good BI governance is just good business. EDUCAUSE Review. https://er.educause.edu/articles/2013/12/good-bi-governance-is-just-good-business
Greer, J., & Mark, M. (2016). Evaluation methods for intelligent tutoring systems revisited. International Journal of Artificial Intelligence in Education, 26, 387–392. https://doi.org/10.1007/s40593-015-0043-2
Jim, C. K., & Chang, H. (2018). The current state of data governance in higher education. In L. Freund (Ed.), Proceedings of the Association for Information Science and Technology (pp. 198–206). Wiley. https://doi.org/10.1002/pra2.2018.14505501022
Kerrigan, M. R. (2014). A framework for understanding community colleges’ organizational capacity for data use: A convergent parallel mixed-methods study. Journal of Mixed Methods Research, 8(4), 341–362. https://doi.org/10.1177/1558689814523518
Klein, C., Lester, J., Rangwala, H., & Johri, A. (2019). Technological barriers and incentives to learning analytics adoption in higher education: Insights from users. Journal of Computing in Higher Education, 31, 604–625. https://doi.org/10.1007/s12528-019-09210-5
Knight, D. B., Brozina, C., & Novoselich, B. (2016). An investigation of first-year engineering student and instructor perspectives of learning analytics approaches. Journal of Learning Analytics, 3(3), 215–238. https://doi.org/10.18608/jla.2016.33.11
Vatrapu, R., Teplovs, C., Fujita, N., & Bull, S. (2011). Toward visual analytics for teachers’ dynamic diagnostic pedagogical decision-making. In Proceedings of the 1st International Conference on Learning Analytics and Knowledge (LAK ’11) (pp. 93–98). Association for Computing Machinery. https://doi.org/10.1145/2090116.2090129
Weber, K., Otto, B., & Österle, H. (2009). One size does not fit all—A contingency approach to data governance. Journal of Data and Information Quality, 1(1), Article 4. https://doi.org/10.1145/1515693.1515696
Jason P. Browning, PhD, MBA, is an experienced higher education data professional. He has led the institutional research function at the Northern Wyoming Community College District and developed the institutional effectiveness function at Dixie State University. Dr. Browning’s research focuses on institutional research, specifically how data and analytics can be used to advance higher education. Contact him at firstname.lastname@example.org.