
Hands-On Machine Learning with Sklearn: Link Prediction Using Sklearn Modules

BlackFlagBin / 2751 reads


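Introduction to Sklearn
Since its release in 2007, scikit-learn has become one of the most important machine learning libraries for Python. scikit-learn, commonly abbreviated as sklearn, supports four major families of machine learning algorithms: classification, regression, dimensionality reduction, and clustering. It also includes three supporting modules for feature extraction, data preprocessing, and model evaluation.

  sklearn is an extension of SciPy and builds on the NumPy and matplotlib libraries. Drawing on the strengths of these libraries can greatly improve the efficiency of machine learning work.
  sklearn has thorough documentation, is easy to get started with, offers a rich API, and is popular in academia. It wraps a large number of machine learning algorithms, including LIBSVM and LIBLINEAR. It also ships with a number of built-in datasets, which saves the time otherwise spent acquiring and cleaning data.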

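Project overview
Link prediction uses historical connection information to predict connections that may form in the future; that is, it uses the edges currently present in a network to predict the edges that may appear later.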
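To make the idea concrete, here is a minimal sketch on a toy graph (the 4-node adjacency matrix is hypothetical and is not taken from the article's data files). It counts common neighbors with a matrix product and lists the scores of the node pairs that are not yet connected, which are the candidates a link predictor would rank.

```python
import numpy as np

# Toy undirected graph with 4 nodes (hypothetical data).
# Edges: 0-1, 0-2, 1-2, 2-3
A = np.array([
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
])

# (A @ A)[i, j] counts the neighbors shared by nodes i and j.
common_neighbors = A @ A

# Keep only the unordered pairs (i < j) that are not connected yet.
candidates = [
    (i, j, int(common_neighbors[i, j]))
    for i in range(4)
    for j in range(i + 1, 4)
    if A[i, j] == 0
]
print(candidates)
```

Pairs with more shared neighbors are treated as more likely to become connected; the similarity indices computed in the script refine this raw count by normalizing it with the node degrees.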

Project source code

from math import isnan

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from sklearn import preprocessing
from sklearn.model_selection import train_test_split  # data-splitting utility
from sklearn.neighbors import KNeighborsClassifier  # k-nearest neighbors (kNN) classifier
from sklearn.svm import SVC

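# Define functions that compute baseline similarity indices based on common neighbors
# Compute the Jaccard similarity index

def Jaccavrd(MatrixAdjacency_Train):
    # (A . A)[i, j] is the number of neighbors shared by nodes i and j
    Matrix_similarity = np.dot(MatrixAdjacency_Train, MatrixAdjacency_Train)

    # Node degrees as a column vector, so deg_row + deg_row_T broadcasts to k_i + k_j
    deg_row = np.sum(MatrixAdjacency_Train, axis=0)
    deg_row.shape = (deg_row.shape[0], 1)
    deg_row_T = deg_row.T
    tempdeg = deg_row + deg_row_T
    # Size of the union of both neighborhoods: k_i + k_j - (common neighbors)
    temp = tempdeg - Matrix_similarity

    # Pairs of isolated nodes give 0/0; map the resulting NaN to 0
    with np.errstate(divide="ignore", invalid="ignore"):
        Matrix_similarity = np.nan_to_num(Matrix_similarity / temp)
    return Matrix_similarity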
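As a sanity check on the matrix formulation, the sketch below compares it with the set-based definition of the Jaccard index, |N(i) ∩ N(j)| / |N(i) ∪ N(j)|. The helper names `jaccard_matrix` and `jaccard_sets` and the toy graph are illustrative assumptions, not part of the original script.

```python
import numpy as np

def jaccard_matrix(A):
    """Jaccard index for all node pairs via matrix operations."""
    cn = A @ A                            # common-neighbor counts
    deg = A.sum(axis=0).reshape(-1, 1)    # degrees as a column vector
    union = deg + deg.T - cn              # |N(i)| + |N(j)| - |N(i) & N(j)|
    with np.errstate(divide="ignore", invalid="ignore"):
        return np.nan_to_num(cn / union)  # 0/0 (isolated pairs) becomes 0

def jaccard_sets(A, i, j):
    """Jaccard index for one pair, straight from the set definition."""
    ni = set(np.flatnonzero(A[i]))
    nj = set(np.flatnonzero(A[j]))
    union = ni | nj
    return len(ni & nj) / len(union) if union else 0.0

# Hypothetical toy graph with edges 0-1, 0-2, 1-2, 2-3
A = np.array([
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
])

J = jaccard_matrix(A)
print(J[0, 1], jaccard_sets(A, 0, 1))
print(J[0, 3], jaccard_sets(A, 0, 3))
```

Both approaches agree on every pair of this graph; the matrix version computes all pairs at once, which is why the script uses it.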
# Compute the Salton index

def Salton_Cal(MatrixAdjacency_Train):
    # Number of common neighbors for every node pair
    similarity = np.dot(MatrixAdjacency_Train, MatrixAdjacency_Train)

    deg_row = np.sum(MatrixAdjacency_Train, axis=0)
    deg_row.shape = (deg_row.shape[0], 1)
    deg_row_T = deg_row.T
    # sqrt(k_i * k_j) for every node pair
    tempdeg = np.dot(deg_row, deg_row_T)
    temp = np.sqrt(tempdeg)

    # Nodes with degree 0 give 0/0; suppress the warning and map NaN to 0
    with np.errstate(divide="ignore", invalid="ignore"):
        Matrix_similarity = np.nan_to_num(similarity / temp)
    return Matrix_similarity
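The Salton index divides the common-neighbor count by sqrt(k_i * k_j), so a node with degree 0 produces a 0/0 division. The following sketch, on a hypothetical toy graph with one isolated node and with an illustrative helper name `salton_matrix`, shows why the NaN handling is needed.

```python
import numpy as np

def salton_matrix(A):
    """Salton index: common neighbors / sqrt(k_i * k_j)."""
    cn = A @ A
    deg = A.sum(axis=0).reshape(-1, 1)
    denom = np.sqrt(deg @ deg.T)
    # Silence the 0/0 warning locally instead of changing NumPy's global settings.
    with np.errstate(divide="ignore", invalid="ignore"):
        return np.nan_to_num(cn / denom)

# Hypothetical graph: edges 0-1 and 0-2; node 3 is isolated.
A = np.array([
    [0, 1, 1, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
])

S = salton_matrix(A)
print(S[1, 2])  # nodes 1 and 2 share neighbor 0 and both have degree 1
print(S[0, 3])  # node 3 is isolated, so the raw ratio would be 0/0
```

Using `np.errstate` as a context manager keeps the warning suppression scoped to this one division, whereas `np.seterr` changes the setting for the rest of the program.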

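def file2matrix(filepath):
    # Read a 50 x 50 adjacency matrix stored as space-separated values, one row per line
    matrix = np.zeros((50, 50), dtype=float)
    with open(filepath) as f:
        for A_row, line in enumerate(f):
            values = line.strip("\n").split(" ")
            # Fill exactly one row with the first 50 values of the line
            matrix[A_row, :] = [float(v) for v in values[0:50]]
    return matrix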
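If the data files really are plain whitespace-separated numeric matrices (an assumption, since the article does not show their contents), `np.loadtxt` can read them directly. The sketch below writes a small hypothetical 3 x 3 file to a temporary location and loads it back.

```python
import os
import tempfile

import numpy as np

# Hypothetical file contents: one matrix row per line, values separated by spaces.
rows = ["0 1 1", "1 0 0", "1 0 0"]
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
    f.write("\n".join(rows) + "\n")
    path = f.name

# By default np.loadtxt splits on whitespace and returns a 2-D float array.
matrix = np.loadtxt(path, dtype=float)
os.remove(path)

print(matrix.shape)
```

This avoids hard-coding the matrix size, because the shape is inferred from the file.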

filepath = "3600/s0001.txt"
MatrixAdjacency = file2matrix(filepath)

similarity_matrix_Jaccavrd = Jaccavrd(MatrixAdjacency)
similarity_matrix_Salton = Salton_Cal(MatrixAdjacency)

filepath2 = "3600/s0002.txt"
MatrixAdjacency2 = file2matrix(filepath2)

similarity_matrix_Jaccavrd2 = Jaccavrd(MatrixAdjacency2)
similarity_matrix_Salton2 = Salton_Cal(MatrixAdjacency2)

filepath3 = "3600/s0003.txt"
MatrixAdjacency3 = file2matrix(filepath3)

similarity_matrix_Jaccavrd3 = Jaccavrd(MatrixAdjacency3)
similarity_matrix_Salton3 = Salton_Cal(MatrixAdjacency3)

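# Get the number of rows and columns of the Jaccard similarity matrix

Jaccard_Row = similarity_matrix_Jaccavrd.shape[0]
Jaccard_Column = similarity_matrix_Jaccavrd.shape[1]
Jaccard_List = []
for i in range(Jaccard_Row):
    for j in range(Jaccard_Column):
        # Source is cut off after "if i"; i < j (upper triangle, 1225 pairs) is assumed
        if i < j:
            Jaccard_List.append(similarity_matrix_Jaccavrd[i][j])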
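# Get the number of rows and columns of the Salton similarity matrix

Salton_Row = similarity_matrix_Salton.shape[0]
Salton_Column = similarity_matrix_Salton.shape[1]
Salton_List = []
for i in range(Salton_Row):
    for j in range(Salton_Column):
        # Source is cut off after "if i"; i < j (upper triangle, 1225 pairs) is assumed
        if i < j:
            Salton_List.append(similarity_matrix_Salton[i][j])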
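# Get the number of rows and columns of the Jaccard similarity matrix (second file)

Jaccard_Row2 = similarity_matrix_Jaccavrd2.shape[0]
Jaccard_Column2 = similarity_matrix_Jaccavrd2.shape[1]
Jaccard_List2 = []
for i in range(Jaccard_Row2):
    for j in range(Jaccard_Column2):
        # Source is cut off after "if i"; i < j (upper triangle, 1225 pairs) is assumed
        if i < j:
            Jaccard_List2.append(similarity_matrix_Jaccavrd2[i][j])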
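# Get the number of rows and columns of the Salton similarity matrix (second file)

Salton_Row2 = similarity_matrix_Salton2.shape[0]
Salton_Column2 = similarity_matrix_Salton2.shape[1]
Salton_List2 = []
for i in range(Salton_Row2):
    for j in range(Salton_Column2):
        # Source is cut off after "if i"; i < j (upper triangle, 1225 pairs) is assumed
        if i < j:
            Salton_List2.append(similarity_matrix_Salton2[i][j])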
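# Get the number of rows and columns of the Jaccard similarity matrix (third file)

Jaccard_Row3 = similarity_matrix_Jaccavrd3.shape[0]
Jaccard_Column3 = similarity_matrix_Jaccavrd3.shape[1]
Jaccard_List3 = []
for i in range(Jaccard_Row3):
    for j in range(Jaccard_Column3):
        # Source is cut off after "if i"; i < j (upper triangle, 1225 pairs) is assumed
        if i < j:
            Jaccard_List3.append(similarity_matrix_Jaccavrd3[i][j])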
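# Get the number of rows and columns of the Salton similarity matrix (third file)

Salton_Row3 = similarity_matrix_Salton3.shape[0]
Salton_Column3 = similarity_matrix_Salton3.shape[1]
Salton_List3 = []
for i in range(Salton_Row3):
    for j in range(Salton_Column3):
        # Source is cut off after "if i"; i < j (upper triangle, 1225 pairs) is assumed
        if i < j:
            Salton_List3.append(similarity_matrix_Salton3[i][j])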
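# Get the number of rows and columns of the adjacency matrix

Adjacency_Row = MatrixAdjacency.shape[0]
Adjacency_Column = MatrixAdjacency.shape[1]
Adjacency = []
for i in range(Adjacency_Row):
    for j in range(Adjacency_Column):
        # Source is cut off after "if i"; i < j (upper triangle, 1225 pairs) is assumed
        if i < j:
            Adjacency.append(MatrixAdjacency[i][j])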
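# Get the number of rows and columns of the adjacency matrix (second file)

Adjacency_Row2 = MatrixAdjacency2.shape[0]
Adjacency_Column2 = MatrixAdjacency2.shape[1]
Adjacency2 = []
for i in range(Adjacency_Row2):
    for j in range(Adjacency_Column2):
        # Source is cut off after "if i"; i < j (upper triangle, 1225 pairs) is assumed
        if i < j:
            Adjacency2.append(MatrixAdjacency2[i][j])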
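# Get the number of rows and columns of the adjacency matrix (third file)

Adjacency_Row3 = MatrixAdjacency3.shape[0]
Adjacency_Column3 = MatrixAdjacency3.shape[1]
Adjacency3 = []
for i in range(Adjacency_Row3):
    for j in range(Adjacency_Column3):
        # Source is cut off after "if i"; i < j (upper triangle, 1225 pairs) is assumed
        if i < j:
            Adjacency3.append(MatrixAdjacency3[i][j])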
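The script later allocates 1225 rows per file, which equals 50 * 49 / 2, the number of unordered node pairs in a 50-node graph. Assuming the nested loops above keep each pair once with `i < j`, the same values can be extracted without Python loops using `np.triu_indices`. The sketch below uses a random matrix as a stand-in for a similarity matrix and confirms that both approaches give the same values in the same order.

```python
import numpy as np

n = 50
rng = np.random.default_rng(0)
M = rng.random((n, n))  # stand-in for a 50 x 50 similarity or adjacency matrix

# Loop version: keep every entry strictly above the diagonal.
loop_values = [M[i][j] for i in range(n) for j in range(n) if i < j]

# Vectorized version: k=1 excludes the diagonal; indices come in row-major order.
rows, cols = np.triu_indices(n, k=1)
vector_values = M[rows, cols]

print(len(loop_values), vector_values.shape)
```

Skipping the lower triangle and the diagonal avoids counting each undirected pair twice and leaves out the self-pairs.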

data = np.zeros((1225,3))
data2 = np.zeros((1225,3))
data3 = np.zeros((1225,3))

for i in range(1225):
    data[i][0] = Jaccard_List[i]
    data[i][1] = Salton_List[i]
    data[i][2] = Adjacency[i]

for j in range(1225):
    data2[j][0] = Jaccard_List2[j]
    data2[j][1] = Salton_List2[j]
    data2[j][2] = Adjacency2[j]

for k in range(1225):
    data3[k][0] = Jaccard_List3[k]
    data3[k][1] = Salton_List3[k]
    data3[k][2] = Adjacency3[k]

data_train_X = data[:,0:2]
data_train_y = data[:,2]

data_test_X = data2[:,0:2]
data_test_y = data2[:,2]

data_target_X = data3[:,0:2]
data_target_y = data3[:,2]
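The row-by-row filling of each `data` array followed by column slicing can also be written with `np.column_stack`. A minimal sketch with three hypothetical node pairs (the values are made up for illustration):

```python
import numpy as np

# Hypothetical per-pair values: two similarity scores and the link label.
jaccard_list = [0.5, 0.0, 0.25]
salton_list = [0.7, 0.0, 0.4]
adjacency = [1.0, 0.0, 0.0]

# Each list becomes one column: [Jaccard, Salton, label]
data = np.column_stack([jaccard_list, salton_list, adjacency])

X = data[:, 0:2]  # feature columns
y = data[:, 2]    # label column

print(data.shape, X.shape, y.shape)
```

This produces the same layout as the preallocated arrays in the script, with the number of rows inferred from the lists.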

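# Fit a k-nearest neighbors classifier on the features and links of the first file
knn = KNeighborsClassifier()
knn.fit(data_train_X, data_train_y)

# Print the predictions for the second file, followed by its actual links
print(knn.predict(data_test_X))

print(data_test_y)

# Pair the features of the first file with the links of the second file,
# so the SVM is trained to predict the links of the following file
clf = SVC()
clf.fit(data_train_X, data_test_y)

# Accuracy when the features of the second file are scored against the links of the third
print(clf.score(data_test_X, data_target_y))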
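`clf.score` reports plain accuracy. Link data is often imbalanced, because most node pairs are unconnected, so it can help to compare accuracy with a majority-class baseline and with a ranking metric such as ROC AUC. The sketch below runs on synthetic data; the `make_snapshot` helper, the 10% link rate, and the feature distributions are all hypothetical and are not the article's dataset.

```python
import numpy as np
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.svm import SVC

rng = np.random.default_rng(0)

def make_snapshot(n_pairs=1225, link_rate=0.1):
    """Synthetic stand-in: two features per node pair, with rare positive labels."""
    y = (rng.random(n_pairs) < link_rate).astype(int)
    # Linked pairs get features shifted upward so the classes are separable.
    X = rng.normal(loc=y[:, None] * 1.5, scale=1.0, size=(n_pairs, 2))
    return X, y

X_train, y_train = make_snapshot()
X_test, y_test = make_snapshot()

clf = SVC()
clf.fit(X_train, y_train)

accuracy = accuracy_score(y_test, clf.predict(X_test))
# decision_function returns a continuous score, which is what ROC AUC ranks.
auc = roc_auc_score(y_test, clf.decision_function(X_test))
# Accuracy obtained by always predicting the more frequent class.
majority_baseline = max(y_test.mean(), 1 - y_test.mean())

print(accuracy, auc, majority_baseline)
```

An accuracy close to the majority baseline suggests the classifier has learned little beyond "no link", even when the number itself looks high.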

Further project details

For more information about this project, send an email to 18770918982@gmail.com

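Copyright of this article belongs to the author. Do not reproduce it without permission. If this article violates any rules, you can contact the administrator to have it removed.

When reproducing, please cite the address of this article: https://www.ucloud.cn/yun/43556.html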
