The Data for Impact Summer Institute is a ten-week, project-centric program for highly motivated students focused on Data Science, Research, Analytics, Visualization, and Execution. The Institute begins with workshops, taught by Lehigh faculty, that bring students up to speed on the core knowledge of data statistics, applications, analysis, and visualization. For the remaining weeks, student teams will work on a wide array of interdisciplinary data-centric projects with meaningful outcomes. Projects may address compelling social, economic, population health, or community-related topics, with the aspiration of contributing to real, sustainable impact. These projects will be faculty-guided and inquiry-driven, and will prepare students to engage with faculty research while empowering them to pursue their own ideas and data-related career pathways. Project teams have the opportunity to advance their work into the academic year through the multi-year Creative Inquiry project framework.
The Data for Impact Summer Institute is offered in partnership with the Office of Creative Inquiry, the Martindale Center for the Study of Private Enterprise, and the Institute for Data, Intelligent Systems, and Computation (I-DISC). The D4I Summer Institute specifically welcomes applications from rising sophomores and juniors but is open to all Lehigh students.
2021 Data for Impact Projects
Covered Interest Rate Parity in the Crypto-Currency Market
Faculty Mentor: Patrick Zoro (Finance)
Description: The goal of this project is to disprove the covered interest rate parity argument in the general crypto-currency decentralized finance market. Briefly, covered interest rate parity states that if you borrow in currency A and invest in currency B, and simultaneously sell currency B forward, any profit opportunity should be eliminated.
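As a rough illustration of the parity condition described above, the sketch below computes the forward rate at which the borrow, convert, invest, and sell-forward trade earns nothing, and the profit left over at any other forward rate. The function names, the quoting convention (units of currency B per unit of currency A), and the sample rates are assumptions made for illustration. They are not taken from the project, and the sketch ignores fees, collateral, and other frictions that matter in real crypto-currency markets.

```python
def cip_forward(spot, rate_a, rate_b):
    """Forward rate (units of B per unit of A) implied by covered interest parity.

    spot   -- spot rate, units of B per unit of A (assumed quoting convention)
    rate_a -- borrowing rate in currency A over the period
    rate_b -- investment rate in currency B over the period
    """
    return spot * (1 + rate_b) / (1 + rate_a)


def covered_trade_profit(spot, forward, rate_a, rate_b):
    """Profit in currency A per 1 unit of A borrowed.

    Steps: borrow 1 A, convert to B at spot, invest at rate_b,
    convert the proceeds back to A at the agreed forward rate, repay the loan.
    """
    proceeds_in_a = spot * (1 + rate_b) / forward
    repayment_in_a = 1 + rate_a
    return proceeds_in_a - repayment_in_a


if __name__ == "__main__":
    spot, rate_a, rate_b = 2.0, 0.05, 0.10  # hypothetical values
    parity_forward = cip_forward(spot, rate_a, rate_b)
    print("Parity forward rate:", parity_forward)
    # At the parity forward rate the profit is zero up to rounding.
    print("Profit at parity:", covered_trade_profit(spot, parity_forward, rate_a, rate_b))
    # At any other forward rate a nonzero profit (or loss) remains.
    print("Profit at forward = 2.0:", covered_trade_profit(spot, 2.0, rate_a, rate_b))
```

A persistent nonzero value from the second function, after costs, is the kind of deviation from parity that a project like this one would look for in market data.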
DeFi (Decentralized Finance) Market Analysis
Faculty Mentor: Patrick Zoro (Finance)
Description: This project will analyze historic trades (confirmed transactions) monitoring large / active accounts. What tokens are they buying/selling? Where are they providing liquidity? What other insights can be gained?
Designing Fast Machine Learning
Faculty Mentor: Joshua Agar (Materials Science & Engineering)
Description: Machine learning has provided an opportunity to achieve beyond-human performance in a range of tasks. A less-considered advantage is the ability of machine learning models to make complex decisions with latencies much shorter than human reaction times (>300 ms). Achieving FastML requires careful codesign of models, optimization methods, networking, and hardware to create end-to-end cyberphysical systems. This project will codesign machine learning models for deployment on field-programmable gate arrays for applications with hard latency constraints ranging from milliseconds to nanoseconds. Students will have the opportunity to deploy these models for a variety of applications including additive manufacturing of materials, scanning and electron microscopy, biomedical devices, and much more.
Does Inclusive Finance for the Poor Help Mitigate COVID-19’s Impact on Women and Families?
Faculty Mentor: Todd Watkins (Economics/Martindale Center)
Description: Students on this project will build a database that can help uncover how effective inclusive financial services and fintech are for the communities they purport to serve – namely, women and lower socioeconomic strata. This short-term information and data gathering will pay off in a longer-term design of solutions to make these services better. The team will begin by examining the microeconomic consequences of the COVID-19 pandemic, particularly towards women in the workforce, and whether financial efforts made to help those most affected have worked at all. Future applications of this work could include broader interrogations of payday lending, youth unemployment, and global economic mobility.
Interpretability of a Supervised Learning-based Trading Strategy / Bitcoin Trading Strategy
Faculty Mentor: Patrick Zoro (Finance)
Description: The primary objective of this project is to make a supervised learning-based trading strategy interpretable using different approaches, and to gain insight into which signals drive the strategy. The existing strategy is based on a random forest classifier, which already has a feature importance plot for interpretability. The team will generate other forms of interpretability for this approach, using a deep artificial neural network (ANN) in place of the random forest both for the trading strategy and for exploring interpretability techniques.
Machine Learning for High-Speed Biomedical Imaging
Faculty Mentor: Yaling Liu (Bioengineering / Mechanical Engineering)
Description: Combining the fields of data science, bioengineering, high-speed imaging, and machine learning, this project’s goal is to design and build an intelligent and automated platform for high-throughput imaging cell flow cytometry. Students will use data-driven models for the classification of cell types based on optical/fluorescence cell imaging and the physical properties of cells, ultimately investigating the application of machine learning in dynamic intelligent cell sorting. This cell sorting system could be used for early diagnosis of tumor cells, tracking stem cell differentiation, and more.
Faculty Mentor: Patrick Zoro (Finance)
Description: This project will create a Pennsylvania Stock Market index and its derivatives. The first step is to identify all publicly listed companies headquartered in PA. The team will then create a value-weighted index of these firms, irrespective of industry, monitor the index, and publish its current value on the MFE website.
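To make "value-weighted" concrete, here is a minimal sketch in which each firm's weight is its share of total market capitalization and the index level scales with total capitalization relative to a base date. The tickers, the capitalization figures, and the base level of 100 are hypothetical. The sketch also assumes a fixed set of constituents; a published index would need a divisor adjustment for listings, delistings, and corporate actions, which is not shown here.

```python
def value_weights(market_caps):
    """Return each firm's weight as its share of total market capitalization."""
    total = sum(market_caps.values())
    return {ticker: cap / total for ticker, cap in market_caps.items()}


def value_weighted_index(base_caps, current_caps, base_level=100.0):
    """Index level relative to the base date for a fixed set of constituents.

    base_caps    -- market capitalizations on the base date, keyed by ticker
    current_caps -- market capitalizations today, same tickers
    base_level   -- index value assigned to the base date (assumed to be 100)
    """
    base_total = sum(base_caps.values())
    current_total = sum(current_caps[ticker] for ticker in base_caps)
    return base_level * current_total / base_total


if __name__ == "__main__":
    # Hypothetical firms and market caps (in millions of dollars).
    base = {"AAA": 300.0, "BBB": 100.0}
    today = {"AAA": 330.0, "BBB": 90.0}
    print(value_weights(base))
    print(value_weighted_index(base, today))
```

Because the weights follow market capitalization, a 10% move in the larger hypothetical firm shifts the index more than the same percentage move in the smaller one.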
Practical Training and Research in Computer-Aided Drug Discovery
Faculty Mentor: Wonpil Im (Bioengineering / Biological Sciences)
Description: One of the most remarkable features of proteins is their ability to bind other molecules specifically and reversibly. Such molecular recognition is associated with almost all biological functions in living systems. Drug compounds bind to proteins and regulate their functions to produce beneficial effects in treating diseases. Therefore, a better understanding of protein-ligand interactions at the molecular level, and accurate quantification or prediction of their binding affinity, are at the core of computer-aided drug discovery. Students on this project will study protein-ligand interactions computationally using three families of impactful therapeutic targets for cancers and AIDS. In particular, students will gain practical hands-on research experience in computer-aided drug discovery. The lectures and tools in CHARMM-GUI (http://www.charmm-gui.org/lecture) will be used for student learning and research.
Sports Analytics aka Moneyball
Faculty Mentor: Patrick Zoro (Finance)
Description: Our goal is to explore line changes and discrepancies to develop an algorithmic trading strategy to create risk-free arbitrage returns. Once this strategy is developed, we then wish to develop further strategies with statistical arbitrage using established sports probability models such as Sabermetrics. Additionally, we would explore developing a strategy considering sharp money sentiment from the market to ensure our other strategies are focused on the most volatile line changes with the largest discrepancies.
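One simple way a line discrepancy can produce a risk-free return is sketched below. It assumes decimal odds and that the best available price for each mutually exclusive outcome can be taken at a different bookmaker. If the implied probabilities sum to less than 1, staking in proportion to them pays out the same amount whichever outcome occurs. The function name, the odds, and the bankroll are hypothetical, and the sketch ignores betting limits, fees, and the risk of a line moving before every bet is placed.

```python
def arbitrage_stakes(odds, bankroll=100.0):
    """Split a bankroll across mutually exclusive outcomes quoted in decimal odds.

    odds     -- best available decimal odds for each outcome (one per outcome)
    bankroll -- total amount to stake across all outcomes

    Returns (stakes, profit) if a risk-free return exists, otherwise None.
    """
    implied = [1.0 / o for o in odds]  # implied probability of each outcome
    total_implied = sum(implied)
    if total_implied >= 1.0:
        return None  # the combined lines leave no arbitrage
    # Stakes proportional to implied probability equalize the payout.
    stakes = [bankroll * p / total_implied for p in implied]
    payout = bankroll / total_implied  # same payout whichever outcome wins
    return stakes, payout - bankroll


if __name__ == "__main__":
    # Hypothetical two-outcome event with each side priced at 2.10
    # by a different bookmaker.
    print(arbitrage_stakes([2.10, 2.10]))
    # Typical single-bookmaker pricing: implied probabilities exceed 1.
    print(arbitrage_stakes([1.90, 1.90]))
```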
Program Info / Proposal Format and Submission Process
The Mountaintop Summer Experience (MTSE) and Data for Impact Summer Institute (D4I) run for 10 weeks (2021 program dates are June 1st - August 6th). At this time, both programs are scheduled to operate fully virtually, though some in-person work may be possible in Building C or elsewhere on campus, depending on safety considerations and protocols.
Most projects are part of Mountaintop, which is open to projects from all disciplines, fields, and areas of inquiry. Projects with a specific focus on, or need for, data science, analysis, visualization etc., can be considered as a special subcategory of Data for Impact. Creative Inquiry partners with the Martindale Center and the Institute for Data, Intelligent Systems, and Computation (I-DISC) to design the D4I program and select projects. The D4I program includes some data-focused workshops during the first weeks of the summer, in order to bring students up to speed with core concepts, methodologies, and praxis.
All summer program projects are expected to continue into the Fall 2021 semester (and, preferably, beyond), through the CINQ courses.
Students selected as “Fellows” in Mountaintop or Data for Impact projects are paid a stipend of $4,000 (for undergraduates), or $5,500 (for graduate students), for the summer, paid in five biweekly installments beginning mid-June. Projects typically have two, or at most three, Fellow opportunities. Fellows are expected to make a full-time commitment for the 10 weeks of the program, and are not allowed to take summer courses or have jobs/internships/vacations which would substantially take them away from their project work. Fellows are also expected to continue working on their projects through the Fall semester (and possibly beyond) by enrolling in the CINQ courses for credit.
“Associates” on Mountaintop or Data for Impact projects are not paid a stipend, and make a part-time commitment (approx. 10 hours/week) to their project work. Associates are invited and encouraged to attend all program presentations, workshops, and guest speakers. Associates are also strongly encouraged to continue working through the CINQ courses in the Fall semester and beyond. Students not receiving a stipend who have high financial need may inquire about financial assistance that may be available.
Lead faculty project mentors in both programs can utilize up to $500 in discretionary funds. These funds are not tracked by the Office of Creative Inquiry, and faculty may use them as they choose. However, the intention of these funds is to help disseminate the work of the project (through conference registration fees and the like) and/or to supplement (or in some cases support) project expense budgets. **NOTE: These funds are subject to adjustment or change if unexpected budgetary constraints arise**
Project expense budgets (for needed equipment, supplies, reagents, resources, etc.) will be considered as requested.
Applications are open from March 25th through April 5th for both Mountaintop and Data for Impact programs. The same application is used for both MTSE and D4I programs. The application form includes a list of available projects, and students will be asked to rank their 1st, 2nd, and 3rd choice of projects (one 3rd-choice option will be “any”). After applications close, faculty project mentors will be given a list of student applicants who indicated an interest in their project in the application process, and will contact those students for an interview. The Office of Creative Inquiry will not select students for specific projects; those decisions are entirely at the discretion of the faculty lead mentor(s). Faculty mentors should inform the Office of Creative Inquiry of their student selections no later than Friday, April 23rd and students should be informed by the end of April 2021 of their selection.
Faculty lead mentors are welcome and encouraged to recruit students for their own projects. Those students will need to fill out an application form to the program, but will be automatically accepted based on faculty recommendation / request.
Both MTSE and D4I will offer developmental, informational, and practical workshops and activities, as well as an array of guest “Innovators in Residence,” many of whom will offer guest lectures and/or meet with individual teams who could benefit from their expertise. Fellows are expected to attend these; Associates are strongly encouraged. Students in both programs will be asked to regularly present on their progress in formal and informal settings, to other program participants and external audiences (through a Summer Expo).
Expectations for faculty mentors: faculty are not expected to “teach” as part of these programs. Projects proposed should have direct relevance to the faculty mentor’s research agenda and/or impact agenda. The most successful MTSE and D4I students are those who demonstrate self-efficacy, an execution-focused mindset, strong teamwork skills, openness to learning new skill sets and mindsets, and a comfort level with open-ended questions and problems. The primary roles of the faculty mentor are to provide advice, resources, and guidance, and to sync up with the team on a regular basis (at least weekly) to monitor progress.
Students: What’s in it for me?
Develop / strengthen skill sets in data science, analysis and visualization: a growing field for all disciplines that opens new career pathways;
Experience working with interdisciplinary teams and driving research forward collaboratively with student, faculty, and external partners;
Opportunities to work collaboratively with other self-selected, motivated students from across the university on ambitious data-centric projects;
Applying principles of data science to high-impact, real-world projects / applying knowledge to practical execution;
Learning fast, learning big, learning to fail and reiterate, and learning on the go;
Prepare your resume to join faculty research endeavors, pursue your own ideas for new ventures, and qualify for new internship and career opportunities.