Data for Impact Summer Institute

The Data for Impact Summer Institute is a ten-week project-centric program for highly motivated students focused on Data Science, Research, Analytics, Visualization, and Execution. The Institute will include workshops, taught by Lehigh faculty, which will get students up to speed on the core knowledge of data statistics, applications, analysis and visualization. For the remaining weeks, student teams will work on a wide array of interdisciplinary data-centric projects with meaningful outcomes. Projects may address compelling social, economic, population health, or community-related topics with aspirations for contributing to real sustainable impact. These projects will be faculty-guided and inquiry-driven, and prepare students to engage with faculty research while empowering them to pursue their own ideas and data-related career pathways. Project teams have the opportunity to advance their work into the academic year through the multi-year Creative Inquiry project framework.

The Data for Impact (D4I) Summer Institute is offered in partnership with the Office of Creative Inquiry, the Martindale Center for the Study of Private Enterprise, and the Institute for Data, Intelligent Systems, and Computation (I-DISC). The D4I Summer Institute specifically welcomes applications from rising sophomores and juniors but is open to all Lehigh students.

Data for Impact 2021 Projects

Designing Fast Machine Learning

Project Mentor: Joshua Agar, Materials Science & Engineering

Project Description: Machine learning has provided an opportunity to achieve beyond-human performance in a range of tasks. A less-considered advantage is the ability of machine learning models to make complex decisions with latencies far shorter than human reaction times (>300 ms). Achieving fast machine learning (FastML) requires careful co-design of models, optimization methods, networking, and hardware to create end-to-end cyber-physical systems. This project will co-design machine learning models for deployment on field-programmable gate arrays (FPGAs) for applications with hard latency constraints ranging from milliseconds to nanoseconds. Students will have the opportunity to deploy these models for a variety of applications, including additive manufacturing of materials, scanning and electron microscopy, biomedical devices, and much more.

Fellow and Associate positions available. Students from all majors are welcome to apply. 

Machine Learning for High-Speed Biomedical Imaging

Project Mentor: Yaling Liu, Bioengineering/Mechanical Engineering

Project Description: Combining data science, bioengineering, high-speed imaging, and machine learning, this project aims to design and build an intelligent, automated platform for high-throughput imaging flow cytometry. Students will use data-driven models to classify cell types based on optical/fluorescence cell imaging and the physical properties of cells, ultimately investigating the application of machine learning to dynamic, intelligent cell sorting. Such a cell sorting system could be used for early diagnosis of tumor cells, tracking stem cell differentiation, and more.

Fellow and Associate positions available. Interested students will ideally have a background in programming and/or hardware integration for image processing and machine learning implementation.

Does Inclusive Finance for the Poor Help Mitigate COVID-19's Impact on Women and Families?

Project Mentor: Todd Watkins, Economics / Martindale Center for the Study of Private Enterprise

Project Description: Students on this project will build a database that can help uncover how effective inclusive financial services and FinTech are for the communities they purport to serve, namely women and people in lower socioeconomic strata. This short-term information and data gathering will pay off in the longer-term design of solutions that make these services better. The team will begin by examining the microeconomic consequences of the COVID-19 pandemic, particularly for women in the workforce, and whether financial efforts made to help those most affected have worked at all. Future applications of this work could include broader interrogations of payday lending, youth unemployment, and global economic mobility.

Fellow and Associate positions available. Ideal students for this project have interests in one or more of the following: development economics, microfinance, impact investing, FinTech, and related financial inclusion tools. They will also have some experience with statistical analysis, data visualization, and scraping data to build databases. Open to interested students from any major or minor.

Practical Training and Research in Computer-Aided Drug Discovery

Project Mentor: Wonpil Im, Bioengineering/Biological Sciences

Project Description: One of the most remarkable features of proteins is their ability to bind specifically and reversibly to other molecules. Such molecular recognition is associated with almost all biological functions in living systems. Drug compounds bind to proteins and regulate their functions to produce beneficial effects that treat diseases. Therefore, a better understanding of protein-ligand interactions at the molecular level, together with accurate quantification or prediction of their binding affinity, is at the core of computer-aided drug discovery. Students on this project will study protein-ligand interactions computationally using three families of impactful therapeutic targets for cancers and AIDS. In particular, students will gain practical, hands-on research experience in computer-aided drug discovery. The lectures and tools in CHARMM-GUI will be used for student learning and research.

Fellow and Associate positions available. Students should have some knowledge of, or interest in, coding/programming or data analysis.

Covered Interest Rate Parity in the Crypto-Currency Market

Faculty Mentor: Patrick Zoro (Finance)

Project Description: The goal of this project is to disprove the covered interest rate parity argument in the general crypto-currency decentralized finance market. Briefly, covered interest rate parity states that if you borrow in currency A and invest in currency B, and simultaneously sell currency B forward, any profit opportunity should be eliminated.
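
The parity condition described above can be written out as a small calculation. The following is a minimal sketch, not the project's method: it assumes simple (non-compounded) interest over one period, and the function names, rates, and prices are hypothetical values chosen for illustration. In a crypto setting the "interest rates" would presumably be lending or borrowing rates and the forward price a futures quote, which is also an assumption here.

```python
def cip_forward(spot, rate_a, rate_b, years=1.0):
    """Forward price of B (in units of A) implied by covered interest parity.

    spot   : spot price of one unit of B, in units of A
    rate_a : annual interest rate for borrowing A (simple interest, assumed)
    rate_b : annual interest rate for investing B (simple interest, assumed)
    """
    return spot * (1.0 + rate_a * years) / (1.0 + rate_b * years)


def cip_deviation(spot, forward, rate_a, rate_b, years=1.0):
    """Profit, in units of A, from borrowing 1 A, investing in B, and selling B forward.

    A value of zero means the parity condition holds for the quoted forward.
    """
    units_b = 1.0 / spot                          # convert the borrowed A into B
    grown_b = units_b * (1.0 + rate_b * years)    # invest B at rate_b
    proceeds_a = grown_b * forward                # sell B forward for A
    owed_a = 1.0 + rate_a * years                 # repay the A loan with interest
    return proceeds_a - owed_a


if __name__ == "__main__":
    # Hypothetical numbers for illustration only.
    fair = cip_forward(spot=2.0, rate_a=0.10, rate_b=0.05)
    print("parity-implied forward:", fair)
    print("profit at a quoted forward of 2.2:", cip_deviation(2.0, 2.2, 0.10, 0.05))
```

A nonzero deviation that persists after transaction costs would be the kind of evidence against parity that the project description refers to; this sketch ignores fees, collateral requirements, and slippage.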

Fellow and Associate positions available.

Interpretability of a Supervised Learning-based Trading Strategy / Bitcoin Trading Strategy

Faculty Mentor: Patrick Zoro (Finance)

Project Description: The primary objective of this project is to make a supervised learning-based trading strategy interpretable using several different approaches, and to gain insight into which signals drive the strategy. The existing strategy is based on a random forest classifier, which already has a feature importance plot for interpretability. The team will generate other forms of interpretability for this approach. For this work, a deep artificial neural network (ANN) will be used in place of the random forest, both for the trading strategy and for exploring interpretability techniques.
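
One model-agnostic way to ask which signals drive a classifier-based strategy is permutation importance: shuffle one input feature at a time and measure how much accuracy drops. The sketch below is an illustration only and is not taken from the project. The `toy_signal` function is a hypothetical stand-in for a trained model, and the data are randomly generated rather than market data.

```python
import random


def accuracy(predict, rows, labels):
    """Fraction of rows for which predict(row) matches the label."""
    hits = sum(1 for row, y in zip(rows, labels) if predict(row) == y)
    return hits / len(rows)


def permutation_importance(predict, rows, labels, n_features, seed=0):
    """Accuracy drop when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = accuracy(predict, rows, labels)
    importances = []
    for j in range(n_features):
        column = [row[j] for row in rows]
        rng.shuffle(column)  # break the link between feature j and the labels
        shuffled = [row[:j] + [v] + row[j + 1:] for row, v in zip(rows, column)]
        importances.append(baseline - accuracy(predict, shuffled, labels))
    return importances


def toy_signal(row):
    """Hypothetical stand-in for a trained model: 'buy' (1) when feature 0 is positive."""
    return 1 if row[0] > 0 else 0


def make_toy_data(n=200, seed=1):
    """Random two-feature rows labeled by toy_signal; feature 1 is pure noise."""
    rng = random.Random(seed)
    rows = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(n)]
    labels = [toy_signal(row) for row in rows]
    return rows, labels


if __name__ == "__main__":
    rows, labels = make_toy_data()
    print(permutation_importance(toy_signal, rows, labels, n_features=2))
```

Because the procedure only needs a `predict` function, the same idea applies whether the underlying model is a random forest or a neural network.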

Fellow and Associate positions available.

Pennsylvania Index

Faculty Mentor: Patrick Zoro (Finance)

Project Description: This project will create a Pennsylvania stock market index and its derivatives. The first step is to identify all publicly listed companies headquartered in PA. The team will then create a value-weighted index of these firms, irrespective of industry, monitor the index, and publish its current value on the MFE website.
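
As a rough illustration of what "value-weighted" means, the sketch below scales a base level by the ratio of total current market capitalization to total base-date market capitalization. It is a simplified assumption, not the project's actual methodology: the tickers and figures are made up, the base level of 100 is arbitrary, and real indices also adjust a divisor for constituent changes and corporate actions.

```python
def index_weights(caps):
    """Weight of each company = its market cap divided by the total market cap."""
    total = sum(caps.values())
    return {ticker: cap / total for ticker, cap in caps.items()}


def value_weighted_index(base_caps, current_caps, base_value=100.0):
    """Index level = base_value * (total current cap / total base-date cap).

    Both dictionaries map ticker -> market capitalization and must cover
    the same set of companies in this simplified version.
    """
    if set(base_caps) != set(current_caps):
        raise ValueError("base and current constituents must match")
    return base_value * sum(current_caps.values()) / sum(base_caps.values())


if __name__ == "__main__":
    # Hypothetical tickers and market caps (in billions of dollars).
    base = {"AAA": 50.0, "BBB": 30.0, "CCC": 20.0}
    now = {"AAA": 55.0, "BBB": 30.0, "CCC": 25.0}
    print("weights today:", index_weights(now))
    print("index level:", value_weighted_index(base, now))
```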

Fellow and Associate positions available.

DeFi (Decentralized Finance) Market Analysis

Faculty Mentor: Patrick Zoro (Finance)

Project Description: This project will analyze historical trades (confirmed transactions), monitoring large and active accounts. What tokens are they buying and selling? Where are they providing liquidity? What other insights can be gained?

Fellow and Associate positions available. 

Data-to-Control: Toward Data-Driven Model Predictive Control for Chemical Process Automation

Faculty Mentor: Srinivas Rangarajan (Chemical & Biomolecular Engineering)

Project Description: Most chemical and biological processes are dynamical systems. This means that their state variables (i.e., the variables that characterize what state the system is in) are continuously changing. Modern plants in the energy and chemical industry have advanced data acquisition technologies, enabled in many cases by solutions offered by OSISoft LLC, the industrial partner on this project. These technologies allow for collecting, storing, and analyzing data from thousands of sensors every second (or faster). Our ultimate goal is to leverage this data to design, optimize, and control new energy and chemical systems. Students will develop algorithms that extract the underlying ordinary differential equations from time-varying data. These algorithms will then allow us to take time-varying plant data and build data-driven dynamic equations that accurately capture the overall process. We specifically intend to build on state-of-the-art algorithms from the applied mathematics community for inferring equations from data, which have been successfully applied in the fluid mechanics domain. We will extend them with a number of new features, including infusing chemical engineering domain knowledge as constraints while training the data-driven model.
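
To make "extracting an ordinary differential equation from time-varying data" concrete, here is a deliberately tiny sketch. It is not the project's algorithm: it assumes a single state variable, noise-free samples at a fixed time step, and a candidate model of the form dx/dt = c0 + c1 * x, and it estimates the coefficients by least squares on finite-difference derivatives. The function names and the simulated first-order decay are hypothetical.

```python
import math


def simulate_decay(x0, k, dt, n):
    """Samples of x(t) = x0 * exp(-k * t), standing in for measured plant data."""
    return [x0 * math.exp(-k * i * dt) for i in range(n)]


def central_differences(xs, dt):
    """Approximate dx/dt at the interior samples with central differences."""
    return [(xs[i + 1] - xs[i - 1]) / (2.0 * dt) for i in range(1, len(xs) - 1)]


def fit_linear_ode(xs, dt):
    """Least-squares fit of dx/dt = c0 + c1 * x; returns (c0, c1)."""
    dxdt = central_differences(xs, dt)
    x_mid = xs[1:-1]  # samples aligned with the derivative estimates
    n = len(x_mid)
    sx = sum(x_mid)
    sxx = sum(x * x for x in x_mid)
    sy = sum(dxdt)
    sxy = sum(x * y for x, y in zip(x_mid, dxdt))
    det = n * sxx - sx * sx
    if det == 0:
        raise ValueError("data do not vary enough to identify the model")
    # Solve the 2x2 normal equations with Cramer's rule.
    c0 = (sy * sxx - sx * sxy) / det
    c1 = (n * sxy - sx * sy) / det
    return c0, c1


if __name__ == "__main__":
    data = simulate_decay(x0=2.0, k=0.5, dt=0.01, n=500)
    c0, c1 = fit_linear_ode(data, dt=0.01)
    print("estimated dx/dt = %.4f + %.4f * x" % (c0, c1))
```

Domain knowledge could enter a fit like this as a constraint, for example by fixing c0 to zero when the process is known to have no constant source term. Real plant data would also need noise handling and a much richer set of candidate terms.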

Program Info / Proposal Format and Submission Process

General Program Info

The Mountaintop Summer Experience (MTSE) and Data for Impact Summer Institute (D4I) run for 10 weeks (2021 program dates are June 1st - August 6th). At this time, both programs are scheduled to operate fully virtually, though some in-person work may be possible in Building C or elsewhere on campus, depending on safety considerations and protocols.

Most projects are part of Mountaintop, which is open to projects from all disciplines, fields, and areas of inquiry. Projects with a specific focus on, or need for, data science, analysis, visualization etc., can be considered as a special subcategory of Data for Impact. Creative Inquiry partners with the Martindale Center and the Institute for Data, Intelligent Systems, and Computation (I-DISC) to design the D4I program and select projects. The D4I program includes some data-focused workshops during the first weeks of the summer, in order to bring students up to speed with core concepts, methodologies, and praxis.

All summer program projects are expected to continue into the Fall 2021 semester (and, preferably, beyond), through the CINQ courses.

Funding - Fellows and Associates

Students selected as “Fellows” in Mountaintop or Data for Impact projects are paid a stipend of $4,000 (for undergraduates), or $5,500 (for graduate students), for the summer, paid in five biweekly installments beginning mid-June. Projects typically have two, or at most three, Fellow opportunities. Fellows are expected to make a full-time commitment for the 10 weeks of the program, and are not allowed to take summer courses or have jobs/internships/vacations which would substantially take them away from their project work. Fellows are also expected to continue working on their projects through the Fall semester (and possibly beyond) by enrolling in the CINQ courses for credit.

“Associates” on Mountaintop or Data for Impact projects are not paid a stipend, and make a part-time commitment (approx. 10 hours/week) to their project work. Associates are invited and encouraged to attend all program presentations, workshops, and guest speakers. Associates are also strongly encouraged to continue working through the CINQ courses in the Fall semester and beyond. Students not receiving a stipend who have high financial need may inquire about financial assistance that may be available.

Lead faculty project mentors in both programs can utilize up to $500 in discretionary funds. These funds are not tracked by the Office of Creative Inquiry, and faculty may use them as they choose. However, the intention of these funds is to help disseminate the work of the project (through conference registration fees and the like) and/or to supplement (or in some cases support) project expense budgets. NOTE: These funds are subject to adjustment or change if unexpected budgetary constraints arise.

Project expense budgets (for needed equipment, supplies, reagents, resources, etc.) will be considered as requested.

Student Selection

Applications are open from March 25th through April 5th for both Mountaintop and Data for Impact programs. The same application is used for both MTSE and D4I programs. The application form includes a list of available projects, and students will be asked to rank their 1st, 2nd, and 3rd choice of projects (one 3rd-choice option will be “any”). After applications close, faculty project mentors will be given a list of student applicants who indicated an interest in their project, and will contact those students for an interview. The Office of Creative Inquiry will not select students for specific projects; those decisions are entirely at the discretion of the faculty lead mentor(s). Faculty mentors should inform the Office of Creative Inquiry of their student selections no later than Friday, April 23rd, and students should be informed of their selection by the end of April 2021.

Faculty lead mentors are welcome and encouraged to recruit students for their own projects. Those students will need to fill out an application form to the program, but will be automatically accepted based on faculty recommendation / request.

Program Resources

Both MTSE and D4I will offer developmental, informational, and practical workshops and activities, as well as an array of guest “Innovators in Residence,” many of whom will offer guest lectures and/or meet with individual teams who could benefit from their expertise. Fellows are expected to attend these; Associates are strongly encouraged. Students in both programs will be asked to regularly present on their progress in formal and informal settings, to other program participants and external audiences (through a Summer Expo). 

Faculty Mentor Expectations

Faculty are not expected to “teach” as part of these programs. Projects proposed should have direct relevance to the faculty mentor’s research agenda and/or impact agenda. The most successful MTSE and D4I students are those who demonstrate self-efficacy, an execution-focused mindset, strong teamwork skills, openness to learning new skill sets and mindsets, and a comfort level with open-ended questions and problems. The primary roles of the faculty mentor are to provide advice, resources, and guidance, and to sync up with the team on a regular basis (at least weekly) to monitor progress.

Students: What’s in it for me?

  • Develop / strengthen skill sets in data science, analysis and visualization: a growing field for all disciplines that opens new career pathways;

  • Experience working with interdisciplinary teams and driving research forward collaboratively with student, faculty, and external partners;

  • Opportunities to work collaboratively with other self-selected, motivated students from across the university on ambitious data-centric projects;

  • Apply principles of data science to high-impact, real-world projects, turning knowledge into practical execution;

  • Learn fast, learn big, learn to fail and iterate, and learn on the go;

  • Prepare your resume to join faculty research endeavors, pursue your own ideas for new ventures, and qualify for new internship and career opportunities.

Data for Impact Summer Institute 2020
The Data for Impact Summer Institute debuted in summer 2020. Eighty-six students participated in 15 different projects that year. Programs and faculty mentors are listed below.

Data for Impact Summer Institute 2020 Projects