(Project) Python, Streamlit: Analysis and Movie Recommendation Program Using IMDB Movie Ranking Data, Part 2. Data Analysis

하방주인장 2023. 5. 15. 18:00

2. Building the sidebar

2-1. How to use

# Sidebar_1) How to use
with st.sidebar:
    st.subheader("How to use")
    # One outer <span> carries the grey color; <br/> separates the lines
    st.markdown("""<span style='color:grey'>Using the IMDB Top 250 movie data,<br/>
    this app analyzes the Top 5 years, genres, and directors<br/>
    that appear most often in the ranking. :smile:<br/>
    Select a column, pick the value you want to look at in detail,<br/>
    then type the rank of a movie into the search box to get its details.</span><br/>
    <hr style='margin: 0px; border-bottom: 3px dashed rgba(49, 51, 63, 0.2);'>""", unsafe_allow_html=True)

=> Setting unsafe_allow_html=True in Streamlit's markdown function lets the text be written with HTML tags.
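
Because the HTML is passed through as-is, every tag that is opened has to be closed by hand. The following is a minimal sketch of one way to assemble such a snippet from plain lines; the helper name build_help_html and the sample sentences are hypothetical and not part of the original app.

```python
from html import escape


def build_help_html(lines, color="grey"):
    """Join plain-text lines into one <span> block separated by <br/> tags."""
    # Escape each line so a stray '<' or '&' in the text cannot break the markup.
    body = "<br/>\n".join(escape(line) for line in lines)
    # A single outer span carries the color; every tag opened here is also closed.
    return f"<span style='color:{color}'>{body}</span>"


# Sample sentences are placeholders for illustration only.
help_html = build_help_html([
    "Uses the IMDB Top 250 movie data.",
    "Pick a column, then pick a value to inspect.",
])
print(help_html)
```

The resulting string could then be passed to st.markdown(help_html, unsafe_allow_html=True) inside the sidebar block.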

2-2. Column selectbox

# Sidebar_2) Column selectbox
with st.sidebar:
    group_by_option = st.selectbox('Select a column: ', ['None', 'year', 'genre', 'director_name'])

=> The selectbox offers the default 'None' plus the columns used as the basis for analysis ('year', 'genre', 'director_name').

3. Main

3-1. Before/after selecting a column

Before a column is selected, the app shows the whole DataFrame; once a column is selected, it shows a Top 5 chart.

# Main) Title
st.title(':movie_camera: IMDB Top 250 Movies')

# Main) Top 5 per column
selected_data = 'None'
if group_by_option == 'None':
    ## No column selected: show the whole dataset
    main_df = df[['title', 'imbd_votes', 'imbd_rating']].copy()

    ## Turn the DataFrame index into the rank
    main_df.index = main_df.index + 1     # index + 1 = rank
    main_df.index.name = 'rank'
    st.dataframe(main_df, width=700)

else:
    ## Column selected: draw the Top 5 chart
    st.subheader(f'Top 5 most frequently ranked by {group_by_option}')
    ## Extract the Top 5 as two columns: the selected column and 'count'
    ## (rename_axis + reset_index(name=...) gives the same column names on pandas 1.x and 2.x)
    selected_column_df = (df[group_by_option].value_counts().head(5)
                          .rename_axis(group_by_option).reset_index(name='count'))
    st.bar_chart(selected_column_df, x=group_by_option, y='count')  # draw the chart

    ## Store the Top 5 values of the selected column as a list
    group_list = selected_column_df[group_by_option].to_list()

    # Sidebar_3) Data selectbox
    with st.sidebar:
        selected_data = st.selectbox('Select a value: ', ['None'] + group_list)

=> A selectbox for choosing a value is built from the Top 5 result.
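
The step from value counts to selectbox options can be tried outside Streamlit. This is a sketch under assumptions: the helper name top_n_counts and the toy genre values are made up for illustration and stand in for the real IMDB table.

```python
import pandas as pd


def top_n_counts(df, column, n=5):
    """Return the n most frequent values of `column` with their counts."""
    counts = df[column].value_counts().head(n)
    # rename_axis + reset_index(name=...) yields the columns [column, 'count']
    # regardless of how the installed pandas version names them by default.
    return counts.rename_axis(column).reset_index(name="count")


# Toy data standing in for the real IMDB table (values are made up).
toy_df = pd.DataFrame(
    {"genre": ["Drama", "Crime", "Drama", "Action", "Drama", "Crime"]}
)

top_df = top_n_counts(toy_df, "genre", n=2)

# 'None' stays first so that nothing is selected by default.
options = ["None"] + top_df["genre"].to_list()
print(options)
```

The options list is what would be handed to st.selectbox in the sidebar.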

3-2. Before/after selecting a value

# Main) Show the data once a value is selected
if selected_data == 'None':
    ## No value selected: leave the space empty
    st.empty()

else:
    ## Value selected: show the matching rows
    st.subheader(f'Dataset for {selected_data}')
    tmp_df = df.loc[df[group_by_option] == selected_data, ['title', 'imbd_votes', 'imbd_rating']].copy()
    tmp_df.index = tmp_df.index + 1     # index + 1 = rank
    tmp_df.index.name = 'rank'
    st.dataframe(tmp_df, width=700)
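
The filtering and rank-index step above can be checked in isolation. The sketch below assumes, as the app does, that the DataFrame keeps its default 0-based index in ranking order; the helper name rows_for_value and the toy rows are hypothetical.

```python
import pandas as pd


def rows_for_value(df, column, value, show_columns):
    """Filter rows where `column` equals `value` and expose the rank as the index."""
    # .copy() keeps the original DataFrame untouched when the index is changed.
    subset = df.loc[df[column] == value, show_columns].copy()
    subset.index = subset.index + 1  # 0-based position + 1 = rank
    subset.index.name = "rank"
    return subset


# Toy rows in ranking order (titles and years are placeholders).
toy_df = pd.DataFrame({
    "title": ["Movie A", "Movie B", "Movie C"],
    "year": [1994, 1972, 1994],
})

result = rows_for_value(toy_df, "year", 1994, ["title"])
print(result)
```

Because the rank is derived from the original index, the filtered rows keep their overall rank (here 1 and 3) instead of being renumbered.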

3-3. Search box

A search box takes a rank as input and outputs the link to that movie's IMDB detail page.

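# Sidebar_4) Search: enter a rank, get the movie's IMDB link
with st.sidebar:
    st.markdown("<hr style='margin: 0px; border-bottom: 3px dashed rgba(49, 51, 63, 0.2);'>", unsafe_allow_html=True)
    idx = st.text_input('Enter the rank you want to look up.')
    if idx:
        ## Accept only whole numbers that fall within the ranks available in df
        if idx.strip().isdecimal() and 1 <= int(idx) <= len(df):
            st.write(df.loc[int(idx) - 1, 'link'])   # rank - 1 = index
        else:
            st.warning(f'Please enter a number between 1 and {len(df)}.')

    else:
        st.empty()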
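
Text input arrives as a string, so the rank has to be validated before it is used as a row position. The following sketch shows one way to do that without Streamlit; the function name rank_to_position is hypothetical, and 250 is used only because the data is a Top 250 list.

```python
def rank_to_position(text, n_rows):
    """Convert user input such as '7' into a 0-based row position, or None if invalid."""
    text = text.strip()
    # Reject empty input, signs, decimals, and any non-numeric text.
    if not text.isdecimal():
        return None
    rank = int(text)
    # Ranks run from 1 to n_rows inclusive.
    if not 1 <= rank <= n_rows:
        return None
    return rank - 1


print(rank_to_position("1", 250))    # first row
print(rank_to_position("251", 250))  # out of range
print(rank_to_position("abc", 250))  # not a number
```

Returning None for bad input lets the caller decide whether to show a warning or simply leave the space empty.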

4. Result

License: Attribution, NonCommercial, NoDerivatives
